Method and apparatus replacing sub-networks within an IC design

ABSTRACT

Some embodiments of the invention provide a method for pre-tabulating sub-networks. This method (1) generates a sub-network that performs a function, (2) generates a parameter based on this function, and (3) stores the sub-network in a storage structure based on the generated parameter. In some embodiments, the generated sub-network has several circuit elements. Also, in some embodiments, the generated sub-network performs a set of two or more functions. Some embodiments store each generated sub-network in an encoded manner. Some embodiments provide a method for producing a circuit description of a design. This method (1) selects a candidate sub-network from the design, (2) identifies an output function performed by the sub-network, (3) based on the identified output function, identifies a replacement sub-network from a storage structure that stores replacement sub-networks, and (4) replaces the selected candidate sub-network with the identified replacement sub-network in certain conditions. In some embodiments, this method is performed to map a design to a particular technology library. Some embodiments provide a data storage structure that stores a plurality of sub-networks based on parameters derived from the output functions of the sub-networks.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application claims benefit to and is a continuation of the United States Patent application entitled “Method and Apparatus for Pre-Tabulating Sub-Networks,” having Ser. No. 10/062,017, filed on Jan. 31, 2002, now U.S. Pat. No. 7,398,503. This patent application claims benefit to and is a U.S. National Stage Application of International Application PCT/US03/02984 filed on Jan. 1, 2003.

FIELD OF THE INVENTION

The present invention is directed towards method and apparatus for synthesis.

BACKGROUND OF THE INVENTION

A combinational logic synthesizer produces an efficient circuit description for a specific region of an integrated circuit (“IC”). The IC region can be the entire IC or a portion (i.e., a block) of the IC. An IC or IC block typically performs a set of Boolean combinational-logic functions F that depend on a set of Boolean variables X. The Boolean function set F typically includes several functions f_1, . . . , f_m, and the Boolean variable set X typically includes several variables X_1, . . . , X_n.

In terms of the IC design, the variable set X includes inputs of the IC region. Also, the outputs of some or all Boolean functions f_i serve as the outputs of the IC region. In addition, each function f_i specifies a logic operation that needs to be performed on one or more inputs to the function. The output of each Boolean function f_i(X) can be either true or false.

Each function f_i may be initially given in a variety of formats, such as a register transfer level (RTL) description (e.g., a Verilog or VHDL description), a Boolean expression, or a technology-level netlist, etc. These description formats are mutually interchangeable, and there are well-known ways to transform one such description into another.

The “efficiency” of the circuit description produced by a synthesizer is usually measured in terms of the estimated “size” and “depth,” although other criteria are also possible. Size and depth are defined with the desired output format of the description. Two output formats that are typically used for the IC design are: (1) technology-level output, and (2) intermediate-level output.

A technology-level design is a circuit description that is tied to a specific technology library, which is typically referred to as a target library. The circuit elements in a technology-level design can be implemented in silicon as units with known physical characteristics (e.g., known timing behavior, power consumption, size, etc.), since such circuit elements and their relevant logic and physical behavior are described in the target library. Accordingly, for technology-level output, the terms “size” and “depth” usually refer to an actual physical characteristic of the overall circuit. For instance, “size” can be measured in terms of the number of circuit elements, the total area of the circuit elements, the total power consumption, etc., while “depth” can be measured in terms of the circuit's timing behavior, which typically relates to the number of stages of the circuit.

An intermediate-level design is a circuit description that is not tied to a specific technology library. Rather, the circuit description is in some intermediate format that might be mapped onto a specific target library in a subsequent step. An intermediate-level output can include circuit elements that are not tied to a specific target library and that compute arbitrary, complex logic functions.

As intermediate design elements are not necessarily tied to a direct physical IC implementation, the “size” of an intermediate-level design is usually measured by an estimate of the final area, the total sum of all variables within the functions performed by the circuit elements in the design, or some other abstract quantification. Similarly, the “depth” of an intermediate-level design is often abstractly quantified. For instance, it might be measured as the number of stages of the circuit plus an estimate about the internal depth of each circuit element. Other more sophisticated estimates are also possible.

The problem of deriving an efficient circuit design has been extensively studied, because both the size and speed of an IC directly depend on the efficiency of the circuit design. Three current approaches to the problem of combinational-logic optimization include: (1) rule-based techniques, (2) optimization-by-factoring techniques, and (3) two-level minimization techniques.

In rule-based systems, the input is typically a technology-level description. The system then iteratively tries to make improvements according to a relatively small fixed set of rules. Each rule specifies some local configuration and the manner for replacing the configuration with a different set of circuit elements. Such rules are often defined and hand-coded by experts and programmers, although they can sometimes be parameterized by end users of the system. Sets of rules are combined as scenarios or scripts (either by the end user or as templates by human experts). A script specifies a number of optimization passes that are applied sequentially and the subset of rules (and their parameters) that should be used during each pass.

In optimization-by-factoring systems, the input and output are typically expressed in terms of intermediate-level descriptions of circuit elements that implement arbitrary Boolean functions. These systems perform optimization by applying algebraic factoring algorithms that try to identify common sub-functions (i.e., common factors) in different parts of the circuit. Instead of realizing such a sub-function multiple times, the function is extracted (realized separately once) and the result is fed back to the multiple places where it is needed. These systems also modify the design in other ways, such as collapsing nodes (i.e., merging multiple nodes into one), etc.

Two-level minimization is a special optimization technique for two-level logic, e.g., for logic functions that are represented as a sum of products. The known algorithms are very powerful and, in part, even optimal. The application of two-level minimization is limited, though, because only simple logic functions can be represented efficiently in that form.

There are also a variety of algorithms and techniques that have been developed for special underlying chip technologies, such as PLA-folding, Look-Up Table optimization for FPGAs, etc. These are highly specific algorithms that are not suitable for the optimization of more general circuits.

Therefore, there is a need for a robust logic synthesizer that is not limited to simple logic functions or hand-coded functions. Ideally, such a synthesizer would use a rich set of pre-tabulated sub-networks. For such an approach, there is a need for an indexing scheme that efficiently stores and identifies pre-tabulated sub-networks. Ideally, such an indexing scheme would allow for the efficient storing and identification of multi-element and/or multi-function sub-networks.

Some current approaches use indexing schemes for mapping technology-level designs to a specific technology-level library for simple circuit elements. Current approaches find all circuit elements in the library that realize a single-function query. Previous filters were built such that each library circuit element was tested irrespective of whether it is a match. These tests were performed by checking several easily computable characteristics first (to exclude most possibilities fast) and then applying a final test for equality (based on some representation of the logic function).

SUMMARY OF THE INVENTION

Some embodiments of the invention provide a method for pre-tabulating sub-networks. This method (1) generates a sub-network that performs a function, (2) generates a parameter based on this function, and (3) stores the sub-network in a storage structure based on the generated parameter. In some embodiments, the generated sub-network has several circuit elements. Also, in some embodiments, the generated sub-network performs a set of two or more functions. The generated parameter is an index into the storage structure in some embodiments. Also, some embodiments generate this parameter based on a symbolic representation of an output of the function performed by the sub-network. The symbolic representation of a function's output is different than the function's name. Examples of symbolic representation include a binary decision diagram, a truth table, or a Boolean expression. Some embodiments store the graph structure of each generated sub-network in an encoded manner.
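The three pre-tabulation steps can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the generated parameter is a truth table packed into an integer and that the storage structure is an in-memory dictionary; the names `truth_table_index`, `pretabulate`, and `storage` are hypothetical and are not taken from the embodiments described here.

```python
from itertools import product

def truth_table_index(func, num_inputs):
    """Pack a function's truth table into an integer (bit i = output for input row i)."""
    index = 0
    for row, bits in enumerate(product((0, 1), repeat=num_inputs)):
        if func(*bits):
            index |= 1 << row
    return index

# Hypothetical storage structure: (input count, truth-table index) -> sub-network names.
storage = {}

def pretabulate(name, func, num_inputs):
    """Derive a parameter from the function and store the sub-network under it."""
    key = (num_inputs, truth_table_index(func, num_inputs))
    storage.setdefault(key, []).append(name)
    return key

# Two structurally different sub-networks that realize the same function (OR).
key_a = pretabulate("nand_of_inverted_inputs", lambda a, b: not ((not a) and (not b)), 2)
key_b = pretabulate("or2", lambda a, b: a or b, 2)
```

Because both sub-networks compute the same output function, they land under the same key, so a later query by function can retrieve either one as an alternative for the other.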

Some embodiments of the invention provide a method for producing a circuit description of a design. From the design, this method selects a candidate sub-network. It then identifies an output function performed by the sub-network. Based on the identified output function, the method identifies a replacement sub-network from a storage structure that stores replacement sub-networks. It then determines whether to replace the selected candidate sub-network with the identified replacement sub-network. If the method determines to replace the selected candidate sub-network, it replaces this sub-network in the design with the identified replacement sub-network. In some embodiments, the selected sub-network has several circuit elements. Also, in some embodiments, the selected sub-network performs a set of two or more functions. The generated parameter is an index into the storage structure in some embodiments.

In some embodiments, this method maps a design to a particular technology library. In some of these embodiments, the selected sub-network can have a directed acyclic graph structure. Also, in some embodiments, the selected sub-network has several output nodes. In addition, some embodiments use this method to map a design that is based on one technology to a design that is based on a second technology.

Some embodiments provide a method for encoding sub-networks that have a set of circuit elements. This method initially defines a plurality of graphs, where each graph has a set of nodes. It then specifies different sets of local functions for each graph, where each set of local functions for each particular graph includes one local function for each node of the particular graph, and the combination of each graph with one of the sets of local functions specifies a sub-network. The method stores the graph and the local functions. For each particular specified sub-network, the method stores an identifier that specifies the set of particular local functions and the particular graph that specify the particular sub-network.
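The following Python sketch illustrates the idea of storing graphs and local-function sets once and describing each sub-network only by a pair of identifiers. The table layouts and the names `GRAPHS`, `LOCAL_FUNCTION_SETS`, `SUBNETWORKS`, and `evaluate` are assumptions made for this illustration, not the encoding used by any particular embodiment.

```python
# Each graph lists, per node, where its inputs come from:
# a string names a sub-network input, an integer names an earlier node.
GRAPHS = {
    0: (("x0", "x1"), ("x2", "x3"), (0, 1)),  # node 2 reads nodes 0 and 1
}

# Each function set assigns one local function to each node of a graph.
LOCAL_FUNCTION_SETS = {
    0: ("AND", "OR", "NAND"),
    1: ("OR", "OR", "AND"),
}

# A sub-network is stored only as (graph id, function-set id).
SUBNETWORKS = {
    "net_a": (0, 0),
    "net_b": (0, 1),
}

OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
}

def evaluate(subnetwork_id, inputs):
    """Decode a stored identifier pair and compute the value at every node."""
    graph_id, function_set_id = SUBNETWORKS[subnetwork_id]
    graph = GRAPHS[graph_id]
    functions = LOCAL_FUNCTION_SETS[function_set_id]
    values = []
    for fanins, name in zip(graph, functions):
        args = [inputs[f] if isinstance(f, str) else values[f] for f in fanins]
        values.append(OPS[name](*args))
    return values

sample = {"x0": 1, "x1": 1, "x2": 0, "x3": 0}
values_a = evaluate("net_a", sample)
values_b = evaluate("net_b", sample)
```

Sharing one graph among many function sets keeps the per-sub-network record small, which matters when millions of sub-networks are stored.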

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purpose of explanation, several embodiments of the invention are set forth in the following figures.

FIG. 1 illustrates the software architecture of a logic synthesizer of some embodiments.

FIG. 2 illustrates a process that conceptually represents the overall flow of the synthesizer of FIG. 1.

FIG. 3 illustrates an example of a circuit network.

FIG. 4 illustrates an example of a combinational-logic sub-network of the circuit network of FIG. 3.

FIG. 5 illustrates a block diagram of a network database that is used by the logic synthesizer of FIG. 1.

FIG. 6 illustrates a process that a query manager performs to respond to a received query for a set of combinational-logic functions.

FIG. 7 illustrates the components of an indexer that is used in some embodiments.

FIGS. 8 and 9 conceptually illustrate two processes that an indexer manager of the indexer of FIG. 7 performs in some embodiments of the invention.

FIGS. 10-13 illustrate an example of a matching determination performed by the query manager.

FIG. 14 illustrates a process that the query manager performs to determine whether a replacement sub-network matches a candidate sub-network.

FIGS. 15 and 16 illustrate several database tables that are used in some embodiments.

FIG. 17 conceptually illustrates a process that some embodiments use to retrieve pre-tabulated sub-networks from a database.

FIGS. 18-20 illustrate an example of a graph encoding scheme that is used in some embodiments of the invention.

FIG. 21 presents a process that conceptually illustrates several operations performed by a data generator in some embodiments of the invention.

FIG. 22 illustrates a more specific process that a data generator performs in some embodiments of the invention.

FIG. 23 illustrates a pivot node of a graph.

FIG. 24 illustrates a three-node sub-network.

FIG. 25 illustrates a process for performing technology mapping according to some embodiments of the invention.

FIG. 26 illustrates a computer system that can be used in conjunction with some embodiments of the invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, numerous details are set forth for purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and devices are shown in block diagram form in order not to obscure the description of the invention with unnecessary detail.

I. Terminology

The definitions of several terms in this document are provided below.

IC region refers to the entire IC or a portion (or block) of the IC.

Logical description of a design refers to the logical representation of the design. Examples of logical description include an RTL description (e.g., a Verilog, VHDL, HDL, or BLIFF description), a combinational-logic expression (e.g., a Boolean expression), etc. Circuit description of a design refers to the circuit representation of the design. Examples of circuit description include technology-level designs (also called bound circuit networks), intermediate-level designs (also called unbound circuit networks), etc.

IC designs use a variety of circuit elements. Such elements typically have inputs and outputs. These circuit elements include combinational and sequential elements. A combinational element receives one or more discrete-value inputs and generates one or more discrete-value outputs that depend on the values of the received inputs and on one or more logic functions performed by the combinational element. The outputs of combinational elements do not depend on prior states of the element. Boolean combinational elements are one example of combinational elements. In Boolean combinational elements, the inputs and outputs can have one of two values. Each Boolean combinational element computes one or more Boolean functions. Examples of Boolean combinational circuit elements include low-level elements (e.g., inverters, NAND gates, NOR gates, AND gates, OR gates, XOR gates) or high-level elements (e.g., multiplexors, adders, etc.).

Sequential elements are elements that generate outputs that depend on one or more prior states of the element (e.g., one or more prior values of their inputs and/or outputs). Some sequential elements store values that they receive at their inputs at some time and release these values later (depending on a clock or control signal, etc.). Examples of such elements are latches, registers, flip-flops, RAMs, ROMs, etc.

A circuit element that is tied to a specific technology library can be implemented in silicon as a unit with known physical characteristics (e.g., known timing behavior, power consumption, size, etc.). Depending on the underlying technology and design style, a circuit element may correspond to a standard cell, a custom block, an elementary cell of a Gate Array or FPGA, etc.

A technology library (also called a target library) includes standard circuit elements that can be used in a particular technology to implement an IC. Such a library typically includes a description of the logical and physical characteristics of its circuit elements. A net is typically defined as a collection of circuit-element pins that need to be electrically connected. A net may be seen as a physical network of wiring (e.g., metal or polysilicon wiring) that connects pins or a purely abstract connection that propagates a logic signal (e.g., a binary signal) from one set of pins to another set of pins. A netlist is a list of nets.

A technology-level design is a circuit description of an IC region. This description is tied to a specific technology library (i.e., it includes only circuit elements from a specific technology library). An intermediate-level design is also a circuit description of an IC region. However, this description is not tied to a specific technology library (i.e., all or some of its circuit elements are not part of any technology library). Rather, the circuit description is in some intermediate format that might be mapped onto a specific target library in a subsequent step. In other words, an intermediate-level output includes circuit elements that are not tied to a specific target library and that compute arbitrary, complex logic functions.

A circuit network refers to a circuit description of multiple IC's, an entire IC, or a portion of an IC. A circuit network can be either a technology-level design or an intermediate-level design. A circuit network that is a technology-level design is a bound circuit network (i.e., its circuit elements are “bound” to a specific technology library). On the other hand, a circuit network that is an intermediate-level design is an unbound circuit network (i.e., its circuit elements are not “bound” to a specific technology library).

A circuit element in a circuit network is also referred to as a node in the network. Each combinational node (i.e., each combinational-logic circuit element) of a circuit network performs a set of combinational-logic functions, each of which is referred to as a local function of the combinational-logic node. When the circuit network is not bound to a specific technology library, the network's circuit elements are typically not associated with physical characteristics (such as size, timing behavior, etc.), and the local functions of the combinational-logic nodes may be arbitrary complex logic functions.

II. Top Level Architecture and Flow

FIG. 1 illustrates the software architecture of a logic synthesizer 100 of some embodiments of the invention. The synthesizer 100 works on circuit networks that may or may not be bound to specific technology libraries. This synthesizer takes as input a circuit network design 135 and optimizes this design with respect to one or more objectives.

Specifically, this synthesizer generates a modified bound or unbound circuit network that receives the same set of inputs and produces the same set of outputs as the original circuit network supplied to the synthesizer. However, the modified circuit network is superior with respect to the objective or objectives used to optimize the design. For instance, the modified network might have fewer nodes and/or intermediate inputs, or it might be better with respect to physical characteristics (such as size, timing, etc.).

In some embodiments, a user can define the objectives that the synthesizer optimizes. Also, in some embodiments, the synthesizer 100 optimizes the received combinational-logic network design for one or more objectives in view of one or more constraints. Like the objectives, the constraints can relate to the size and depth of the design.

In the embodiment described below, the synthesizer receives combinational-logic-network designs as its inputs. In several examples described below, the combinational-logic network designs are Boolean-network designs. One of ordinary skill will realize that in some embodiments the combinational-logic designs can include multi-value circuit elements that receive inputs and/or generate outputs that can have more than two discrete states.

Also, in other embodiments, the synthesizer 100 receives the starting design in another format. For instance, the synthesizer 100 might receive a logical representation of the starting design (e.g., receive the starting design in logical formats such as Verilog, VHDL, HDL, BLIFF, etc.). In some of these embodiments, the synthesizer uses known techniques to convert received logical representations to circuit networks that it then optimizes.

As shown in FIG. 1, the synthesizer 100 includes a network data storage 105, a global optimizer 110, and one or more costing engines 130. The network data storage 105 stores and manages numerous (e.g., several million) combinational-logic sub-networks that may or may not be tied to specific target libraries.

In some embodiments, the sub-networks stored in the data storage 105 are computed in advance. In some of these embodiments, a data generator 115 pre-tabulates these networks in an automated fashion. Also, in the embodiments described below, this generator is not used during optimization. The network data storage 105 supports queries for alternative implementations of sub-networks in the design. In the embodiments described below, the network data storage is organized as a relational database. However, other embodiments might use different databases (e.g., object-oriented databases), or might use other data storage structures (e.g., data files, etc.), for the data storage 105.

The global optimizer 110 controls the synthesizer's operation by (1) selecting for replacement a candidate sub-network from the design 135, (2) querying the data storage 105 to identify one or more sub-network alternatives for the selected candidate sub-network, (3) analyzing the alternatives, and (4) based on the analysis, determining whether to replace the selected candidate sub-network in the design 135 with an identified alternative.

In some embodiments, the optimizer 110 iteratively performs these four operations until it reaches one or more criteria for stopping. In addition, the optimizer uses one or more of the costing engines 130 to compute costs that will allow the optimizer to analyze the alternative sub-networks retrieved from the data storage. The optimizer 110 can use such costs to assess the alternatives and/or the overall design with respect to one or more objectives and/or constraints.
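The four optimizer operations and the stopping criteria can be sketched as a simple loop. This Python sketch is only an illustration under stated assumptions: the callables `select_candidate`, `query_alternatives`, `cost`, and `replace` are hypothetical stand-ins for the candidate selection, the data storage query, a costing engine, and the design update, and the toy design is a list of expression strings whose cost is their length.

```python
def optimize(design, select_candidate, query_alternatives, cost, replace,
             max_iterations=100):
    """Repeat select / query / analyze / replace until a stopping criterion is met."""
    for _ in range(max_iterations):                    # stopping criterion: iteration budget
        candidate = select_candidate(design)           # operation (1)
        if candidate is None:                          # stopping criterion: nothing left to try
            break
        alternatives = query_alternatives(candidate)   # operation (2)
        if not alternatives:
            continue
        best = min(alternatives, key=cost)             # operation (3)
        if cost(best) < cost(candidate):               # operation (4)
            design = replace(design, candidate, best)
    return design

# Toy usage: a hypothetical table of pre-tabulated alternatives.
TABLE = {"not(not(a))": ["a"], "and(a,a)": ["a", "and(a,a)"]}
start = ["not(not(a))", "or(a,b)", "and(a,a)"]
pending = iter(list(start))

result = optimize(
    start,
    select_candidate=lambda design: next(pending, None),
    query_alternatives=lambda candidate: TABLE.get(candidate, []),
    cost=len,
    replace=lambda design, old, new: [new if item == old else item for item in design],
)
```

Passing the cost function in as a parameter mirrors the separation between the optimizer and the costing engines: the loop itself does not need to know whether timing, area, or power is being measured.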

FIG. 1 illustrates three examples of costing engines 130, which are a timing engine 120, an area engine 125, and a power engine 140. The timing engine 120 computes timing costs for the optimizer in order to guide the optimizer towards generating designs for faster chips. Specifically, when this engine is used, the optimizer informs the timing engine about an alternative that it is considering, and the timing engine in return supplies the optimizer with information about how the alternative would affect the timing behavior of a particular sub-network, the received circuit network, and/or a larger design containing the circuit network. For instance, in some embodiments, the timing engine computes the expected time differential at a certain output if the alternative is selected. In some embodiments, the timing engine uses a user-defined timing model that is appropriate for a certain technology.

The area engine 125 computes and returns an area estimate, which guides the optimizer towards reducing the area of the design. Specifically, when the area engine 125 is used, the optimizer informs this engine about an alternative that it is considering. In turn, the area engine supplies the optimizer with information about how this alternative would affect the area of a particular sub-network, the received circuit network, and/or a larger design containing the circuit network.

The power engine 140 estimates the change in power consumption that results from choosing an alternative. Power consumption of an IC is often one of the important factors in designing an IC. The timing engine, area engine, and power engine can be implemented based on any of the known techniques in the art. Other costing engines may also be used when the optimizer needs to consider other objectives and/or constraints.

FIG. 2 illustrates a process 200 that conceptually represents the overall flow of the synthesizer 100 of FIG. 1 in some embodiments of the invention. In some embodiments, the optimizer 110 directs this overall flow. The synthesizer 100 performs this process for a circuit network 135 that may or may not be bound to a specific technology library. FIG. 3 illustrates an example of a portion 300 of the circuit network 135. As shown in this figure, the circuit network has several Boolean circuit elements 305-335 (also called nodes) and each circuit element performs a particular function (also called the local function).

In the embodiments described below, the received circuit network 135 is expressed in terms of a linked graph data structure that has one node for each circuit element of the network. Each node is either a combinational or sequential logic node. In some embodiments, each node includes a flag that specifies whether it is a combinational-logic node. For each of its combinational-logic nodes, the received network also specifies a set of local functions. In these embodiments, a reduced order binary decision diagram (ROBDD) expresses each local function of a combinational-logic node.

An ROBDD is a graph data structure that includes nodes and edges, where the nodes represent input or output variables, and the edges represent true or false Boolean results. In some embodiments, an ROBDD is used to express each output function of a combinational-logic node. Accordingly, a combinational-logic node has several ROBDD's associated with it when the node has several output bits. For instance, a three-input adder has two ROBDD's, one for each of its two outputs. One of ordinary skill will realize that other embodiments might use other types of BDD's (such as MDD's) or other function representations (such as truth table representations).

As shown in FIG. 2, the process 200 initially traverses the received circuit network to identify any multi-output combinational-logic node (such as a three-input adder), i.e., to identify any combinational-logic node that performs two or more local functions. If the process finds any such node, it replaces this node with a set of single-output combinational-logic nodes.

The process 200 can perform this replacement by using any known techniques in the art. For instance, it can replace the multi-output combinational-logic node with several single-output nodes, where each single-output node performs one local function. For each particular output function performed by the multi-output node, the process initially tries to identify a sub-network (of one or more nodes) stored in its data storage 105 that performs the particular output function; the process for identifying whether the data storage stores a sub-network is described below by reference to FIGS. 5-17. If the process 200 cannot identify a pre-tabulated sub-network for a particular output function of the multi-output node, the process decomposes this function into simpler functions, and then identifies sub-networks stored in the data storage that perform the simpler functions. The process can use any known decomposition process, such as the Shannon expansion, which is discussed in Giovanni De Micheli, Synthesis and Optimization of Digital Circuits, McGraw-Hill, 1994, or the Reed-Muller expansion, which is described in “Some classical mathematical results related to the problems of the firmware/hardware interface,” by T. C. Wesselkamper, Proceedings of the eighth annual workshop on Microprogramming, Chicago, 1975 (available in the ACM Digital Library). Once the process 200 identifies a sub-network for each output function of the multi-output node, the process replaces the multi-output node with the identified set of sub-networks. Specifically, in the received circuit network, the process replaces the multi-output node with several replacement nodes that represent the nodes of the identified set of sub-networks. For each replacement node, the process specifies the node's local function in terms of its ROBDD.
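As an illustration of the decomposition step, the Shannon expansion rewrites a function f around one variable x as f = x·f(x=1) + x'·f(x=0), where the two cofactors are simpler functions of the remaining variables. The Python sketch below checks this identity on the carry output of a three-input adder (the majority function). The helper names `cofactor` and `shannon_recombine` are hypothetical, and the functions are represented as plain Python callables rather than ROBDDs.

```python
from itertools import product

def cofactor(func, position, value):
    """Fix the argument at `position` to `value`; return a function of the other arguments."""
    def restricted(*rest):
        args = list(rest)
        args.insert(position, value)
        return func(*args)
    return restricted

def shannon_recombine(func, position):
    """Rebuild func as x*f1 + (not x)*f0 from its two cofactors on one variable."""
    f1 = cofactor(func, position, 1)
    f0 = cofactor(func, position, 0)
    def rebuilt(*args):
        x = args[position]
        rest = args[:position] + args[position + 1:]
        return (x & f1(*rest)) | ((1 - x) & f0(*rest))
    return rebuilt

def carry(a, b, c):
    """Carry output of a three-input adder (majority of the inputs)."""
    return (a & b) | (a & c) | (b & c)

carry_rebuilt = shannon_recombine(carry, 0)
identity_holds = all(
    carry(*bits) == carry_rebuilt(*bits) for bits in product((0, 1), repeat=3)
)
```

Here the cofactor for a=1 reduces to b OR c and the cofactor for a=0 reduces to b AND c, so each simpler function could be looked up separately in a table of two-input sub-networks.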

After replacing each multi-output node with a set of single-output nodes, the process in some embodiments associates the set of single-output nodes (e.g., links them, identifies their association in a list, or uses some other association technique). This association can then be used during the selection of candidate sub-networks for replacement (at 210, which is described below), in order to identify and select the entire node set that represents a multi-output node. This selection would allow the entire node set of the multi-output node to be optimized alone or in conjunction with other combinational-logic nodes. The collective optimization of the entire node set of a multi-output node is especially beneficial when the data storage stores a desired implementation of the multi-output node.

One of ordinary skill will realize that although the process 200 decomposes a multi-output node into a set of single-output nodes, other embodiments might address multi-output nodes differently. For instance, some might simply not select such nodes for optimization. Others might handle them without any decomposition. For example, the process 200 might try to find alternatives (at 220) based solely on the combinational-logic functions of the candidate sub-network identified at 215. Accordingly, other embodiments may use any kind of representation of the circuit network (including multi-output nodes), so long as the combinational-logic functions realized by a sub-network can be computed.

After 205, the process 200 selects (at 210) a combinational-logic sub-network S in the circuit network 135. The selected sub-network S is a candidate sub-network for replacement. This selected sub-network is also a combinational-logic network, which means that each of its nodes is a combinational-logic node.

In some embodiments, the process 200 selects (at 210) the candidate sub-network S through a random growth process. Numerous such random-growth processes are known in the art. Several such processes randomly pick a combinational-logic node in the circuit network and then randomly pick other combinational-logic nodes that are connected to the first selected combinational-logic node or the subsequently selected combinational-logic nodes, until they reach a certain sub-network complexity (e.g., until the total number of input variables to the sub-network falls within a certain range). At any point in the growth of the candidate sub-network that encounters a sequential-logic node in the circuit network, the growth is terminated at that point (i.e., that point becomes an input or output of the candidate sub-network). Once the growth is terminated at a particular combinational-logic node, the growth might be continued at other nodes of the specified candidate sub-network until the desired level of complexity is achieved.
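A random-growth selection of this kind might be sketched as follows. This is a minimal Python illustration under simplifying assumptions: the circuit network is given as a list of directed edges, the complexity limit is a node count rather than a range on the number of input variables, and the function name `grow_candidate` and its parameters are hypothetical.

```python
import random

def grow_candidate(edges, combinational, max_nodes, rng):
    """Grow a connected set of combinational nodes from a random seed node."""
    # Treat connections as undirected: growth may follow fan-in or fan-out.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    selected = {rng.choice(sorted(combinational))}
    while len(selected) < max_nodes:
        # Sequential nodes are never added, so growth stops at them.
        frontier = sorted(
            n for s in selected for n in neighbors.get(s, ())
            if n in combinational and n not in selected
        )
        if not frontier:
            break
        selected.add(rng.choice(frontier))
    return selected

# Toy network: registers r1 and r2 bound a small cloud of gates g1..g4.
edges = [("r1", "g1"), ("g1", "g2"), ("g2", "g3"), ("g3", "r2"), ("g2", "g4")]
combinational = {"g1", "g2", "g3", "g4"}
candidate = grow_candidate(edges, combinational, max_nodes=3, rng=random.Random(0))
```

Because every added node is adjacent to an already selected node, the result is always a connected set of combinational-logic nodes, with the registers acting as boundaries.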

In some instances, the sub-network identified at 210 has multiple outputs (i.e., the identified sub-network performs multiple functions). In some embodiments, the random-growth process ensures that the selected sub-network S has at least one output dependent on all the inputs of the sub-network. Such a criterion is used in these embodiments in order to facilitate the database indexing, as further described below.

FIG. 4 illustrates an example of a combinational-logic sub-network 400 that is selected from the circuit network that is partly illustrated in FIG. 3. The sub-network 400 illustrated in FIG. 4 has four inputs X0-X3, three circuit elements 305-315, and three outputs Y0-Y2. The three circuit elements are an AND gate 305, a NAND gate 310, and an OR gate 315. The combinational-logic expressions for the three functions represented by the three outputs of the selected sub-network are:

$$Y0 = X0 \cdot X1 \tag{1}$$
$$Y1 = X2 \oplus X3 \tag{2}$$
$$Y2 = \overline{(X0 \cdot X1) \cdot (X2 \oplus X3)} \tag{3}$$

In the sub-network 400, the output Y2 depends on all the inputs X0-X3 of the sub-network.

After 210, the process 200 identifies (at 215) a set of combinational-logic functions F realized by the selected candidate sub-network S. The identified set F includes only one logic function F0 when the selected sub-network S has only one output. On the other hand, the identified set F includes several logic functions F0 . . . FM when the selected sub-network S has several outputs Y0 . . . YM.

The set of output functions F can come from two different types of circuit elements. The first type is a circuit element that does not receive the output of any other circuit elements in the candidate sub-network S selected at 210. In other words, all the inputs of the first-type circuit element are inputs to the selected candidate sub-network S. The second type is a circuit element that receives the output of at least one other circuit element in the selected candidate sub-network S. Some or all of the second-type element's inputs are from other circuit elements in the selected candidate sub-network S.

The candidate sub-network's output function that comes from a first-type circuit element is just the local function of the first-type circuit element. For instance, in the example illustrated in FIG. 4, the sub-network outputs Y0 and Y1 come from first-type elements 305 and 315. As recited by equations (1) and (2) above, these outputs are just the local functions of the AND gate 305 and the OR gate 315.

On the other hand, the function that is supplied at the output of a second-type circuit element has to be derived. Such a function is derived from the local functions of the second-type circuit element and the circuit element or elements whose output or outputs the second-type circuit element receives. For instance, in the example illustrated in FIG. 4, the sub-network output Y2 comes from the second-type element 310. As recited by equation (3), this output is derived from the local function of the NAND gate 310 and the local functions of the AND gate 305 and the OR gate 315.

As mentioned above, each output function is represented by an ROBDD in the embodiments described below. In some of these embodiments, the process 200 uses the BuDDy® software package to identify the ROBDD representation of each function in the set of combinational-logic functions F realized by the candidate sub-network S selected at 210.

Specifically, the process 200 traverses the graph data structure of the selected candidate sub-network to examine each node in the combinational-logic sub-network. The process 200 starts with the first type of nodes (i.e., the nodes that do not receive the output of any other nodes). For each of these nodes, the process provides the BuDDy package with the node's local function (in the form of an ROBDD) and its inputs in the received sub-network. The BuDDy package then identifies the node's output function, which is essentially its local function after it has been modified to account for the representation of its inputs in the combinational-logic sub-network.

After identifying the output functions of all the first-type nodes, the process steps through the second type of nodes (i.e., the nodes that receive the output of at least one other node) in a manner that ensures that it selects each second-type node only after it has identified the ROBDDs of the set of nodes that supply their outputs to this node. For each selected second-type node, the process 200 provides the BuDDy package with the node's local function (in ROBDD format), the node's inputs in the sub-network, and the ROBDDs of the set of nodes that supply their outputs to the node. The BuDDy package then identifies the node's output function (in an ROBDD format) based on the received information.
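The traversal order described above can be sketched in Python. This is only an illustration under stated assumptions: it represents each function by a truth table instead of an ROBDD, it does not call the BuDDy package, and the names output_functions and fig4 are hypothetical. Element 315 is modeled exactly as equation (2) writes it, with the operator read as exclusive-OR; that reading is an assumption.

```python
from itertools import product


def output_functions(inputs, nodes):
    """Derive each node's output function over the sub-network inputs.

    inputs: ordered list of sub-network input names.
    nodes:  dict name -> (local_function, fan_in_names), in any order.
    Returns dict name -> truth table (tuple of 0/1, one entry per input row).
    """
    rows = list(product((0, 1), repeat=len(inputs)))
    # Every sub-network input is represented by its own column of values.
    tables = {x: tuple(r[i] for r in rows) for i, x in enumerate(inputs)}
    pending = dict(nodes)
    while pending:
        progressed = False
        for name, (fn, fan_in) in list(pending.items()):
            # A node is processed only after all of its suppliers are known,
            # so first-type nodes are handled before second-type nodes.
            if all(f in tables for f in fan_in):
                tables[name] = tuple(
                    fn(*(tables[f][k] for f in fan_in)) for k in range(len(rows))
                )
                del pending[name]
                progressed = True
        if not progressed:
            raise ValueError("cycle or undefined fan-in in sub-network")
    return {name: tables[name] for name in nodes}


# The sub-network of FIG. 4, with outputs named after equations (1)-(3).
fig4 = {
    "Y2": (lambda a, b: 1 - (a & b), ("Y0", "Y1")),  # NAND gate 310
    "Y0": (lambda a, b: a & b, ("X0", "X1")),        # AND gate 305
    "Y1": (lambda a, b: a ^ b, ("X2", "X3")),        # element 315, per eq. (2)
}
tables = output_functions(["X0", "X1", "X2", "X3"], fig4)
```

Because Y2 is listed first in the dictionary, the example also shows that the derivation does not depend on the order in which nodes are supplied.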

Information about the BuDDy package can be obtained from Jørn Lind-Nielsen, Computer Systems Section at the Department of Information Technology, Technical University of Denmark, who may be contacted by email or through the University website.

One of ordinary skill will realize that other embodiments might identify the realized set of combinational-logic functions differently. For instance, some embodiments might use other software packages, such as the CUDD package provided by the University of Colorado. Yet other embodiments might not use the ROBDD representation for the combinational-logic functions. Some embodiments might use the truth-table representation, or some other representation. Still other embodiments might use other BDD variants.

After 215, the process 200 queries (at 220) the network data storage 105 to determine whether it has stored one or more alternative sub-networks that implement the combinational-logic function set identified at 215. The network data storage converts the identified combinational-logic function set to a parameter based on which one or more alternative sub-networks might have been stored in the data storage. Accordingly, the network data storage uses this parameter to try to identify one or more alternative sub-networks in the data storage.

In the embodiments described below, the data storage 105 is a database, and the parameter generated by the network database is a set of integer indices into database tables that store the pre-tabulated sub-networks. The network database uses these indices to retrieve any alternative sub-network C that is stored in the database according to the generated set of indices. The operation of the network database in response to the query at 220 will be further described below by reference to FIGS. 5-17. One of ordinary skill will realize that other embodiments might use different storage structures (e.g., data files) and/or different storage parameters (e.g., string indices).

If the process determines (at 220) that the network database 105 does not store any alternative sub-network C that implements the combinational-logic function set identified at 215, it transitions to 240, which will be described below. On the other hand, if the process identifies one or more such alternative sub-networks, it uses (at 225) one or more of the costing engines 130 to compute costs for the candidate sub-network selected at 210 and for each alternative sub-network identified at 220.

The costing engines can estimate actual physical costs when the synthesizer 100 is optimizing a design that is bound to a specific technology. For instance, the area engine 125 might estimate a candidate or alternative sub-network's total area, the number of circuit elements, etc. The timing engine 120 might estimate the sub-network's timing behavior, delay, etc. The power engine 140 might estimate the power consumption of a sub-network.

On the other hand, when the synthesizer is working on an unbound design, the optimizer 110 uses one or more costing engines that compute more abstract costs of the candidate or alternative sub-networks. For instance, an area engine 125 might measure the “size” of an unbound sub-network by some estimate of its final area, by the number of its lowest-level and intermediate-level inputs, by the number of variables within the functions performed by the sub-network's circuit elements, or by some other abstract quantification. Another costing engine might estimate the “depth” of a sub-network as the number of stages of the sub-network plus an estimate about the internal depth of each circuit element. The optimizer might also use such abstract costing engines for bound designs.

In some embodiments, the optimizer uses just one of the costing engines (e.g., uses only the timing engine 120 or the area engine 125) to compute only one cost (e.g., to compute only a timing cost or an area cost) for each candidate or alternative sub-network. In other embodiments, the optimizer uses several costing engines (e.g., uses both the timing and area engines) to compute several costs for each sub-network at issue. In some of these embodiments, the costs generated for each sub-network are combined in a weighted or unweighted fashion into a single cost for the sub-network, while in other embodiments they are not.

Based on the computed costs, the process 200 next determines (at 230) whether any identified alternative sub-network is acceptable. This determination depends on the type of optimization that the optimizer 110 performs. For instance, certain types of optimizations (such as local optimization) do not accept alternative sub-networks that have a worse cost than the original sub-network identified at 210. Other optimization techniques (such as simulated annealing) do accept alternative sub-networks that have worse calculated costs, but decrease the cost penalty (i.e., decrease the number of bad exchanges) that they are likely to accept as the number of iterations of process 200 increases.

The determination (at 230) of whether any identified alternative is acceptable might be based solely on optimizing an objective function, or it might be based on optimizing an objective function within a certain constraint. In addition, the objective function might be associated with only one computed cost or with a (weighted or unweighted) combination of several computed costs. Similarly, the constraints, if any, might be associated with only one computed cost or with several computed costs. Examples of objective functions include an area costing function and a timing costing function, both of which can be computed at 225. Another example of an objective function can be a costing function that produces a weighted combination of an area cost and a timing cost. Such a function can be computed at 225, or it can be computed at 230 based on costs computed at 225. Examples of constraints can include maximum area, time, depth constraints, and/or maximum power consumption.
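One possible reading of this determination is sketched below in Python for the local-optimization case, in which an alternative is accepted only if it is no worse than the candidate. The function names, the dictionary-based cost records, and the choice of a linear weighted sum are assumptions made for illustration and are not taken from the description.

```python
def weighted_cost(costs, weights):
    """Combine several computed costs (e.g., area, delay) into a single cost."""
    return sum(weights[name] * costs[name] for name in weights)


def acceptable(candidate_costs, alternative_costs, weights, limits=None):
    """Local-optimization acceptance test with optional constraints.

    limits maps a cost name to its maximum allowed value (hypothetical form
    of the constraints mentioned in the text, such as maximum area).
    """
    limits = limits or {}
    # Reject any alternative that violates a constraint.
    if any(alternative_costs[name] > bound for name, bound in limits.items()):
        return False
    # Otherwise accept only if the objective function does not get worse.
    return weighted_cost(alternative_costs, weights) <= weighted_cost(
        candidate_costs, weights
    )
```

Setting a single weight to 1 and the rest to 0 reduces the objective to one computed cost, which corresponds to the single-cost embodiments.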

If the process 200 does not find any alternative sub-network acceptable at 230, it transitions to 240, which will be described below. On the other hand, if the process finds one or more alternative sub-networks acceptable at 230, it then exchanges (at 235) the sub-network identified at 210 with one of the acceptable alternative sub-networks. If more than one alternative sub-network is acceptable, the process 200 randomly selects one of the acceptable sub-networks. One of ordinary skill will realize that other embodiments might choose the best sub-network with respect to the costing, or might choose randomly one out of the k best ones, where k is a user-defined parameter. After 235, the process transitions to 240.

At 240, the process determines whether it has reached a criterion for stopping. If not, it transitions back to 210 to identify another candidate sub-network S and to repeat the subsequent operations for this sub-network. Otherwise, the process 200 ends.

In different embodiments of the invention, the optimizer uses different criteria for stopping. For instance, this criterion might relate to the number of iterations performed by the process 200 without identifying a better alternative sub-network. Alternatively, it might simply relate to the overall number of iterations. The criterion for stopping at 240 depends on the type of optimization that the optimizer 110 performs. One type of optimization is simulated annealing, which is described in many publications, such as Nahar et al., Simulated Annealing and Combinatorial Optimization, Proceedings of the 23rd Design Automation Conference, 1986. The pseudo code for the optimization process 200 when it uses a version of simulated annealing is as follows:

S := S_0 (S_0 is the initial solution)
T := T_0 (T_0 is the initial temperature)
Inner_Loop_Count := Outer_Loop_Count := 0
Inner_Loop_Max := i_0 (maximum number of iterations of inner loop)
Outer_Loop_Max := i_max (maximum number of iterations of outer loop)
Repeat
    Repeat
        Increment Inner_Loop_Count and Outer_Loop_Count by 1;
        Identify candidate sub-network, CanS, for replacement; //See 210 of FIG. 2
        Identify set of functions F; //See 215 of FIG. 2
        Call query manager for replacement candidates; //See 220 of FIG. 2
        If any replacement returned,
            Cost each replacement; //See 225 of FIG. 2
            If at least one replacement acceptable, then randomly select one of the acceptable replacements as NewS; //See 230 of FIG. 2
            If (h(NewS) ≤ h(CanS)) or (Random < exp((h(CanS) − h(NewS)) / T)), //See 230 of FIG. 2
                Then replace CanS with NewS; //See 235 of FIG. 2
    Until (Inner_Loop_Count = Inner_Loop_Max) or (Outer_Loop_Count = Outer_Loop_Max)
    T := alpha * T;
    Inner_Loop_Max := beta * Inner_Loop_Max;
    Reset Inner_Loop_Count;
Until Outer_Loop_Count = Outer_Loop_Max //See 240 of FIG. 2

In this pseudo code, T_0, i_0, i_max, alpha, and beta are parameters for the simulated annealing algorithm. These parameters might be specified by a user or set by the optimizer. In some embodiments, alpha equals 0.98, and beta equals 1.1. The selection of the annealing parameters is well studied. One scheme for specifying these parameters is disclosed in “A Comparison of Annealing Techniques for Academic Course Scheduling,” by M. A. Saleh Elmohamed, et al., published at the 2nd international conference, PATAT97. See also, e.g., “Simulated Annealing and Combinatorial Optimization,” by Surendra Nahar, et al., University of Minnesota, 23rd Design Automation Conference, pp. 293-299. Also, several software packages are available for determining the parameters for simulated annealing. One such package is ASA, written by Lester Ingber, who may be contacted through the internet.

The evaluation function h( ) is performed by one of the costing engines, such as the timing engine. Random is a random value within the interval [0,1) that is generated within each pass of the inner Repeat loop. Also, in the example above, the stopping criterion is the overall number of iterations, which is recorded by the Outer_Loop_Count.
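The control flow of the pseudo code can be approximated by the following Python sketch. It is not the optimizer itself: the propose callback is a hypothetical stand-in for operations 210 through 230, the cost h is evaluated on a whole toy state rather than on a sub-network, and the loop tests use "less than" instead of equality because Inner_Loop_Max stops being an integer once it is multiplied by beta.

```python
import math
import random


def anneal(state, propose, h, t0, inner_max, outer_max,
           alpha=0.98, beta=1.1, rng=None):
    """Simulated-annealing loop following the structure of the pseudo code.

    propose(state, rng) returns a replacement state, or None when no
    acceptable replacement was found (hypothetical callback).
    """
    if inner_max < 1 or outer_max < 1:
        raise ValueError("loop limits must be at least 1")
    rng = rng or random.Random()
    t = t0
    inner_count = outer_count = 0
    while outer_count < outer_max:
        while inner_count < inner_max and outer_count < outer_max:
            inner_count += 1
            outer_count += 1
            new_state = propose(state, rng)
            if new_state is None:
                continue
            delta = h(new_state) - h(state)
            # Always accept improvements; accept a worse state with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                state = new_state
        t *= alpha               # cool down
        inner_max *= beta        # lengthen the next inner loop
        inner_count = 0          # reset the inner counter
    return state


# Toy usage: walk an integer toward the minimum of (x - 3)^2.
best = anneal(
    20,
    propose=lambda x, rng: x + rng.choice((-1, 1)),
    h=lambda x: (x - 3) ** 2,
    t0=0.01, inner_max=50, outer_max=500,
    rng=random.Random(1),
)
```

With the very low starting temperature chosen in the toy usage, worse moves are practically never accepted, so the run behaves almost like local optimization; a higher t0 would let the walk escape local minima early on.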

In some embodiments, after the process 200 terminates, the global optimizer maps the optimized circuit network into a technology-level description, if the circuit network is not at that level already. In other words, if the optimized circuit network is not bound to a specific technology library, the global optimizer 110 in some embodiments converts the network into a technology-level network by using standard techniques for performing such conversions. In some of these embodiments, the global optimizer performs this technology mapping by repeating the process 200 except this time it uses a database that is tied to the target technology library (i.e., uses a database that stores only sub-networks that are made up only of circuit elements from the target technology library). Such technology mapping will be further described below in Section V.

The following discussion initially describes in Section III the operation of the network database 105 in response to the search query at 220. Several processes for offline generation of sub-network candidates and the storage of these candidates in the database 105 are then described in Section IV. These processes are performed by the data generator 115 and the network database 105.

III. Network Database

The network database 105 is designed to store a large number (e.g., millions) of sub-networks, and to support queries for the stored sub-networks. In the embodiments described below, this database has the following features. First, it stores each sub-network completely along with full information about the logic function or functions performed by the sub-network. Second, in the embodiments described below, each stored sub-network includes one or more circuit elements, each of which can be independently analyzed (e.g., independently examined, replaced, etc.) by the global optimizer. In other words, the circuit elements of each sub-network do not need to be treated as one entity but rather can be treated as separate elements. In other embodiments, a stored sub-network might include several circuit elements that are part of an unbreakable block that the synthesizing optimizer needs to analyze as one entity.

Third, the stored sub-networks are machine generated in order to exploit the space of all existing networks up to a given size. For instance, in some embodiments, the stored sub-networks cover any implementation of any logic function that can be realized with networks up to a given complexity (e.g., up to the complexity of seven 2-input circuit elements). This is far superior to an approach that only stores alternative implementations of functions that are derived from expert knowledge.

Fourth, the network database uses new encoding schemes for the sub-networks that make it possible to store a large number of sub-networks efficiently. Fifth, this database uses an indexing scheme that efficiently uses memory and CPU time. This indexing scheme transforms the problem of multi-valued logic function matching into a relational database system that is based on sets of integer primary and secondary indices. Accordingly, the network database can use any method from standard relational database systems for fast search and retrieval. Sixth, the database may reside either on disk or in physical memory or partially on both media (e.g., by using a file-in/file-out swapping mechanism).

FIG. 5 illustrates a block diagram of the network database 105 in some embodiments. As shown in this figure, the database includes a query manager 505, a network encoder 510, an indexer 515, a table manager 520, and database tables 525.

A. Query Manager

From the global optimizer 110, the query manager 505 receives a query for one or more sub-networks that perform a set F of combinational-logic functions. The received set of combinational-logic functions might include one function F_1 or several functions F_1, . . . , F_M. In the embodiments described below, each combinational-logic function in the received set is in an ROBDD format, although one of ordinary skill will realize that other formats (such as truth tables, symbolic forms, other types of BDDs, etc.) can be used to represent the received combinational-logic functions.

In response to the received query, the query manager 505 interacts with the network encoder 510, indexer 515, and table manager 520 to try to identify a set of sub-networks that are stored in the database tables 525 and that match the query. In other words, the query manager tries to identify pre-tabulated sub-networks that compute (i.e., realize) all the functions in the received set of functions.

FIG. 6 illustrates a process 600 that the query manager performs to respond to a received query for a set F of combinational-logic functions. As shown in FIG. 6, the query manager initially passes (at 605) the received set of input functions to the indexer 515. This indexer translates each of the functions into an integer index into the database tables 525.

In other words, the indexer 515 generates a set of one or more indices I from the set of one or more functions F. In the embodiments described below, the indexer converts a single-function query F_1 into a single index I_1 into the network database, and this single index identifies a set of one or more sub-networks that realize the function of the query. For a multi-function query, this indexer selects one of the functions as a pivot function, specifies an input variable order based on this pivot function, and generates an index for this pivot function. When a query has more than one viable pivot function, the indexer randomly selects among the pivot functions. Also, when there are several input variable orders (also called input variable configurations below) that can be specified for a pivot function, the indexer randomly selects among the viable input variable configurations. Based on the specified input variable order of the pivot function, the indexer generates an index for each of the non-pivot functions of the query. The indexer 515 is described further below by reference to FIGS. 7-9.

Once the indexer returns (at 605) the set of indices I, the query manager 505 passes (at 610) this set to the table manager 520. The table manager 520 interacts with the database tables 525 that store pre-tabulated sub-networks that are sorted based on sets of indices. In the embodiments described below, the database tables use a relational database scheme to store all sub-networks together with their associated indices. In some embodiments, the database tables also store for each stored sub-network some additional data, such as actual or estimated size or speed of the sub-network. In addition, the sub-networks are stored in these tables in an encoded form.

Based on the received set of indices I, the table manager 520 tries to retrieve a set of pre-tabulated replacement sub-networks from the database tables 525. If the table manager successfully retrieves replacement sub-networks from the database tables, the table manager might also retrieve additional parameters (such as estimated or actual size or speed) that are stored in the database for each replacement sub-network. The global optimizer can use such parameters to decide whether to replace the selected candidate sub-networks with replacement sub-networks retrieved from the database. The operation of the database tables 525 and the table manager 520 is further described by reference to FIGS. 15-17.
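To make the relational idea concrete, here is a minimal Python sketch that uses the standard sqlite3 module. The table names, column names, and helper functions are hypothetical and are not the table layout of the described embodiments; the sketch only shows how encoded sub-networks, their integer indices, and an extra size parameter could be stored and then retrieved by a set of indices.

```python
import sqlite3


def make_tables():
    """Create two in-memory tables: encoded sub-networks and their indices."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE subnetworks (id INTEGER PRIMARY KEY, encoded BLOB, size INTEGER)"
    )
    db.execute("CREATE TABLE function_index (idx INTEGER, subnetwork_id INTEGER)")
    db.execute("CREATE INDEX by_idx ON function_index (idx)")
    return db


def store(db, indices, encoded, size):
    """Store one encoded sub-network under every index in `indices`."""
    cur = db.execute(
        "INSERT INTO subnetworks (encoded, size) VALUES (?, ?)", (encoded, size)
    )
    net_id = cur.lastrowid
    db.executemany(
        "INSERT INTO function_index VALUES (?, ?)",
        [(i, net_id) for i in set(indices)],
    )
    return net_id


def retrieve(db, indices):
    """Return (id, encoded, size) rows associated with every given index."""
    wanted = sorted(set(indices))
    if not wanted:
        return []
    marks = ",".join("?" * len(wanted))
    rows = db.execute(
        "SELECT s.id, s.encoded, s.size FROM subnetworks s "
        "JOIN function_index f ON f.subnetwork_id = s.id "
        f"WHERE f.idx IN ({marks}) "
        "GROUP BY s.id HAVING COUNT(DISTINCT f.idx) = ? ORDER BY s.id",
        (*wanted, len(wanted)),
    )
    return rows.fetchall()
```

The size column plays the role of the additional parameters mentioned above, which the optimizer could inspect before deciding on a replacement.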

After 610, the query manager determines (at 615) whether the table manager returned any replacement sub-network for the set of indices it received at 610. If not, the process transitions to 655, which will be explained below. Otherwise, the query manager selects (at 620) one replacement sub-network retrieved by the table manager at 610.

The replacement sub-networks retrieved by the table manager are in an encoded form. Accordingly, the query manager directs (at 625) the network encoder 510 to decode the replacement sub-network selected at 620. The network encoder decodes this sub-network into (1) a linked graph data structure that has one node for each circuit element of the sub-network and one edge for each connection between the circuit elements of the sub-network, and (2) a local function for each node in the graph structure. Also, each local function is represented by an ROBDD in some embodiments. The network encoding and decoding used in some embodiments are further described in detail in Section III.D.

After decoding (at 625) the replacement sub-network selected at 620, the query manager determines (at 630) whether this replacement sub-network matches the candidate sub-network selected at 210. In the embodiments described below, the replacement sub-network matches the candidate sub-network only if the replacement sub-network performs all the output functions of the candidate sub-network (i.e., all the functions in the set of functions F identified at 215 for the candidate sub-network) for a particular input variable configuration of the replacement sub-network. This determination of the query manager requires understanding of the operation of the indexer 515. Accordingly, this determination will be described below in sub-section III.B.4, after the indexer's operation is described below in sub-sections III.B.1-3. Also, sub-section III.B.4 explains why the query manager even needs to check whether the replacement sub-network performs all the output functions of the candidate sub-network.

If the query manager determines (at 630) that the replacement sub-network selected at 620 does not match the candidate sub-network selected at 210, the query manager transitions to 640, which will be further described below. However, if the selected replacement sub-network matches the selected candidate sub-network, the query manager adds (at 635) the matching replacement sub-network to a set of matching replacement sub-networks. From 635, the query manager transitions to 640.

At 640, the query manager determines whether it has examined all the sub-networks retrieved by the table manager at 610. If not, the query manager returns to 620 to select another retrieved sub-network that it has not yet examined and to repeat its above-described subsequent operations for this sub-network.

When the query manager determines (at 640) that it has examined all the replacement sub-networks retrieved by the table manager, the query manager determines (at 645) whether at least one of the retrieved replacement sub-networks matched the selected candidate sub-network (i.e., determines whether the set of matching replacement sub-networks includes at least one replacement sub-network).

If so, the query manager then returns (at 650) a set of the matching replacement sub-networks to the global optimizer, and then ends. In some embodiments, the query manager receives from the global optimizer parameters (e.g., user-defined parameters) that specify the maximum number of sub-networks to return to the global optimizer. In these embodiments, the query manager uses (at 650) these parameters to select the sub-networks that it wants to return to the optimizer from the set of matching sub-networks. Alternatively, in other embodiments, the query manager (1) might randomly select one or more matching sub-networks to return, (2) might select all matching sub-networks, or (3) might select all matching sub-networks that do not have more than a specified number of nodes.
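The selection alternatives listed above could be combined as in the following Python sketch. The function name select_matches, the dictionary records with a "nodes" field, and the decision to pick randomly when a maximum count applies are all assumptions made for illustration.

```python
import random


def select_matches(matching, max_count=None, max_nodes=None, rng=None):
    """Choose which matching sub-networks to return to the optimizer.

    matching:  list of records, each with a "nodes" entry (node count).
    max_count: optional maximum number of sub-networks to return.
    max_nodes: optional upper bound on the node count of a returned network.
    """
    # Keep only sub-networks that respect the node limit, if one is given.
    chosen = [m for m in matching if max_nodes is None or m["nodes"] <= max_nodes]
    # If too many remain, pick the allowed number at random.
    if max_count is not None and len(chosen) > max_count:
        chosen = (rng or random.Random()).sample(chosen, max_count)
    return chosen
```

Calling the function with both limits left at None returns every matching sub-network, which corresponds to alternative (2).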

On the other hand, if the query manager determines (at 645) that no retrieved replacement sub-network matched the selected candidate sub-network, the query manager transitions to 655. As mentioned above, the query manager also transitions to 655 from 615 when the table manager fails to retrieve any sub-network at 610. From either 615 or 645, the query manager ends up at 655 when it fails to identify a matching pre-tabulated sub-network for the candidate sub-network selected at 210.

This failure might be because the query manager cannot confirm that any retrieved replacement sub-network performs the set of output functions of the selected candidate sub-network. Alternatively, this failure might be due to the indexer's choice for the input variable configuration. As mentioned above, when the indexer has to generate multiple indices (i.e., when the candidate sub-network performs several output functions), the indexer selects an input variable configuration as the basis for the indices. This selection is further described below in Section III.B. If the indexer's selection does not match the input variable configuration selected during the pre-tabulation of the sub-networks in the database, the set of indices generated by the indexer during the run-time optimization operation might not return matching replacement sub-networks (i.e., might not return any replacement sub-network, or might return replacement sub-networks that do not satisfy the two matching criteria of 630).

Accordingly, the query manager performs 655, 660, and 665 to reduce the possibility that the failure to match is because of the indexer's choice of input variable configuration. Specifically, at 655, the query manager determines whether the query was a multi-function query. If not, the failure to find a match is not because of the choice of input variable configuration. Accordingly, the process informs (at 670) the optimizer 110 that it could not find a matching sub-network for the selected candidate sub-network (i.e., could not find any viable pre-tabulated replacement sub-network that performs the same function as the candidate sub-network) and then ends its operation.

On the other hand, if the process determines (at 655) that the query is a multi-function query, the query manager determines (at 660) whether the indexer specified the existence of more than one input variable configuration when it returned the set of indices at 605. When the indexer has not specified this, the failure to identify a matching sub-network is not due to the indexer's choice of input variable configuration. Accordingly, the query manager transitions from 660 to 670, where it informs the optimizer 110 that it could not find a matching sub-network for the selected candidate sub-network. The query manager then ends its operation.

Alternatively, when the indexer specified (at 605) more than one input variable configuration, the query manager transitions to 665 from 660. At 665, the query manager determines whether it has tried a sufficient number of times to get from the indexer a set of indices that results in a matching sub-network. In some embodiments, the query manager uses a function T to specify the number of times that it should try to obtain sets of indices from the indexer. This function returns an integer that is dependent on the number N of input variable configurations specified by the indexer at 605. It should be noted that the indexer might return a different number N each time at 605 for a different pivot function.

If the function T equals N when it receives N (i.e., if T(N) = N), then the expected number of times that the query manager tries each possible input ordering is once, given that the indexer randomly chooses an input variable configuration each time. To speed up the process, some embodiments define T(N) = min[10, (N/constant)], where the constant is at times set to something in the range from 5 to 10. Yet other embodiments (1) specify a likelihood “p”, 0 < p <= 1, for the query manager and the indexer to find an input configuration that matches the configuration used during the pre-tabulation, and then (2) define T(N) = N*p.
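The three variants of the function T can be written out as a short Python sketch. The function names are hypothetical, and two details are assumptions because the description does not state them: fractional results are rounded up, and every variant returns at least 1 so that the query manager makes at least one attempt.

```python
import math


def tries_linear(n):
    """T(N) = N: on average, each configuration is tried once."""
    return n


def tries_capped(n, constant=5, cap=10):
    """T(N) = min[10, N/constant], rounded up and never below 1 (assumed)."""
    return max(1, min(cap, math.ceil(n / constant)))


def tries_probabilistic(n, p):
    """T(N) = N*p for a likelihood p with 0 < p <= 1, rounded up (assumed)."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return max(1, math.ceil(n * p))
```

For example, with N = 24 and a constant of 5, the capped variant allows 5 attempts instead of the 24 attempts that T(N) = N would allow.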

Still other embodiments might have the query manager and indexer deterministically search through possible input variable configurations in hope of finding an input variable configuration used during pre-tabulation to store a particular sub-network. However, such an approach is more time-consuming. This time consumption is problematic when there are no matching sub-networks in the database for any of the input variable configurations. It is also especially problematic when the number of input configurations is relatively large (e.g., for totally symmetric pivot functions like an n-way AND).

If the query manager determines (at 665) that it has not asked the indexer to generate sets of indices for more than the number of times specified by function T, it transitions back to 605 to direct the indexer again to generate a set of indices. As mentioned above and further described below, the indexer generates the set of indices by randomly selecting a viable pivot function and a viable input variable configuration for this pivot function. Consequently, the next generated set of indices might result in the identification of a set of replacement sub-networks that match the selected candidate sub-network.

If the query manager determines at 665 that it has unsuccessfully tried a sufficient number of times to identify a matching replacement sub-network, it informs (at 670) the optimizer 110 that it could not find a matching sub-network for the selected candidate sub-network (i.e., could not find any viable pre-tabulated replacement sub-network that performs the same set of functions as the candidate sub-network), and then ends its operation.
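The overall flow of the process 600 can be summarized in a Python sketch. All five callables passed to run_query are hypothetical placeholders for the indexer, table manager, network encoder, matching check, and function T; the sketch only mirrors the order of the decisions at 605 through 670 and returns an empty list where the description informs the optimizer of a failure.

```python
def run_query(functions, indexer, table_manager, decode, matches, max_tries):
    """Control flow of the query manager, with step numbers as comments.

    indexer(functions)     -> (indices, number_of_input_configurations)
    table_manager(indices) -> list of encoded sub-networks
    decode(encoded)        -> decoded sub-network
    matches(net, functions)-> True if the network realizes all functions
    max_tries(n)           -> allowed number of attempts, i.e., T(N)
    """
    tries = 0
    while True:
        indices, n_configs = indexer(functions)        # 605
        tries += 1
        retrieved = table_manager(indices)             # 610, 615
        matching = [
            net
            for net in map(decode, retrieved)          # 620, 625
            if matches(net, functions)                 # 630, 635, 640
        ]
        if matching:                                   # 645
            return matching                            # 650
        if len(functions) == 1:                        # 655: single function
            return []                                  # 670
        if n_configs <= 1:                             # 660: one configuration
            return []                                  # 670
        if tries >= max_tries(n_configs):              # 665: enough attempts
            return []                                  # 670
```

Because the indexer chooses a pivot function and input variable configuration at random, each pass through the loop may produce a different set of indices, which is why retrying can succeed where an earlier attempt failed.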

B. Indexer

In order to search for sub-networks that realize one or more logic functions, it is necessary to have an efficient indexing scheme for storing these sub-networks. This scheme must be capable of returning flawlessly, or with a high probability, all sub-networks that match a specific query.

The indexer 515 facilitates such an indexing scheme. Specifically, the indexer maps combinational-logic functions to database indices. As in relational databases, the term index denotes an elementary data type (e.g., integers or strings) that can be stored, sorted, and compared. In the embodiments described below, each index is an integer.

The embodiments described below use a direct indexing scheme that can identify single-output or multi-output sub-networks (i.e., sub-networks that realize one function or multiple functions). Each time the indexer 515 receives a single-function or multi-function query from the query manager, this indexer converts each function in the received query to an integer index into the database tables 525. The generated set of indices can then be used to search the network database (like an ordinary relational database) for all entries (i.e., all sub-networks) that are associated (i.e., related) with each index in the generated set.

FIG. 7 illustrates the components of the indexer 515 in some embodiments. As shown in this figure, this indexer includes a translator 705, an input order identifier 710, a hasher 715, and an index manager 720. This indexer can be used during the pre-tabulation of sub-networks, or during run-time optimization of a network. In each instance, the indexer generates a set of indices for a received set of functions. The generated set of indices is used to store the generated sub-networks during pre-tabulation and is used to retrieve pre-tabulated sub-networks during optimization.

The manager 720 directs the flow of operations of the indexer. This manager interacts with the query manager 505, the translator 705, the input order identifier 710, and the hasher 715. The manager 720 uses the translator 705 to convert each function's ROBDD representation into an intermediate integer representation. In the embodiments described below, this intermediate representation is a truthtable representation, as further described below. The manager 720 uses the input-ordering identifier 710 to specify one or more orders for the input variables of the candidate sub-network selected at 210. The manager 720 then uses the hasher 715 to reduce the number of bits that represent each function's index.

FIGS. 8 and 9 conceptually illustrate two processes that the index manager 720 performs in some embodiments. This manager uses the process 800 of FIG. 8 to identify the index for the function of a single-function query. This manager uses the process 900 of FIG. 9 to generate multiple indices for a multi-function query. Specifically, as further described below, the process 900 for a multi-function query (1) identifies one of the query functions as a pivot function, (2) uses the process 800 to identify the index for the designated pivot function and identify an input variable configuration, and then (3) identifies an index for each non-pivot function of the query based on the identified input variable configuration.

1. Computing the Index for a Single-Function Query or a Pivot Index for a Multi-Function Query

As mentioned above, the index manager 720 performs process 800 of FIG. 8 to identify the index for a single-function query. The manager also performs this process as part of the process 900 of FIG. 9 to identify the pivot index for a multi-function query. Whenever the process 800 is called, it is supplied with a function and an operational parameter. The operational parameter specifies whether the process 800 should select an input variable configuration randomly or deterministically. The operational parameter specifies a random operation during optimization, while it specifies a deterministic operation during pre-tabulation.

As shown in FIG. 8, the process 800 initially directs (at 805) the translator 705 to generate a truthtable representation for the function on which it is operating. As mentioned above, the query manager supplies functions in a ROBDD format to the indexer, although in other embodiments the query manager might provide the functions in another format.

The truthtable representation is a binary bit string where each bit corresponds to an output of the function for one set of input values. Table 1 below provides examples of truthtables for two functions. The first function G is a two-input AND function. As shown in the table below, the truthtable for this function is 0001. This truthtable is four bits long because there is one output bit for each set of input values. The second function H is a two-input AND that has its second input inverted. As shown in Table 1 below, the truthtable for this function H is 0100.

TABLE 1

Input 1   Input 2   G   H
False     False     0   0
True      False     0   1
False     True      0   0
True      True      1   0

If the order of inputs 1 and 2 were reversed, the truthtable for function G would remain the same, but the truthtable for function H would change to 0010. As further discussed below, the change in the truthtable for function H reflects the fact that the truthtable representations of functions are dependent on the order of the input signals.
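As an illustrative sketch only, the truthtables of Table 1 and their dependence on input order can be reproduced as follows. The helper `truthtable` and its `order` argument are hypothetical names; the row ordering (input 1 toggling fastest, first row leftmost in the bit string) is taken from Table 1.

```python
def truthtable(func, n_inputs, order=None):
    """Return the truthtable of func as a bit string.

    Rows are enumerated as in Table 1: the first table column toggles
    fastest. order[col] names the argument of func fed by table column col.
    """
    order = list(range(n_inputs)) if order is None else list(order)
    bits = []
    for row in range(2 ** n_inputs):
        args = [0] * n_inputs
        for col, arg_pos in enumerate(order):
            args[arg_pos] = (row >> col) & 1  # value of table column col
        bits.append(str(int(bool(func(*args)))))
    return "".join(bits)


G = lambda a, b: a and b        # two-input AND
H = lambda a, b: a and not b    # two-input AND with second input inverted

tt_g = truthtable(G, 2)                      # "0001"
tt_h = truthtable(H, 2)                      # "0100"
tt_g_swapped = truthtable(G, 2, order=[1, 0])  # "0001", unchanged
tt_h_swapped = truthtable(H, 2, order=[1, 0])  # "0010"
```

Function G is symmetric in its inputs, so its truthtable is unaffected by the swap, whereas the truthtable of H changes.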

After generating (at 805) the truthtable of the function, the process 800 directs (at 810) the input-ordering identifier 710 to identify a canonical representation of the truthtable for the function F. A canonical representation is a fixed, unique representation, chosen from the set of possible representations, that a particular algorithm will always pick. In the embodiments described below, this canonical representation is the smallest truthtable of F from the set of truthtables for all variations of input variable ordering. For instance, the above-described function H has two input variable configurations, since it has two inputs and the order of these two inputs can be switched. The truthtables for these two input variable configurations are 0100 and 0010. Taking the least significant bit of a truthtable to be the rightmost bit, the canonical truthtable representation for function H is 0010.

In some embodiments, the input ordering identifier identifies the canonical truthtable representation for the function F by having a branch-and-bound technique use the truthtable representation identified at 805 as an initial starting point to search through the space of the function's truthtables without examining all input variable configurations. One such branch-and-bound technique is described in “Boolean Matching for Large Libraries” by Uwe Hinsberger and Reiner Kolla, DAC98, Jun. 15-19, 1998.

At 810, the input-ordering identifier also specifies a set of one or more input variable configurations. Each input variable configuration in this set results in the canonical representation identified at 810. In other words, the truthtable representation of F is identical for each configuration in the specified set of input variable configurations.
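The canonicalization result described above (the smallest truthtable together with the set of input variable configurations that produce it) can be illustrated with the following sketch. It enumerates every input permutation exhaustively, which is an assumption made for brevity and is not the branch-and-bound technique referenced above. The names `truthtable` and `canonicalize` are hypothetical.

```python
from itertools import permutations


def truthtable(func, n_inputs, order):
    """Bit-string truthtable; order[col] names the argument fed by column col."""
    bits = []
    for row in range(2 ** n_inputs):
        args = [0] * n_inputs
        for col, arg_pos in enumerate(order):
            args[arg_pos] = (row >> col) & 1
        bits.append(str(int(bool(func(*args)))))
    return "".join(bits)


def canonicalize(func, n_inputs):
    """Return (smallest truthtable, list of input orders that produce it).

    Exhaustive search over all n! input orders; a stand-in for illustration.
    """
    best, configs = None, []
    for order in permutations(range(n_inputs)):
        tt = truthtable(func, n_inputs, order)
        if best is None or int(tt, 2) < int(best, 2):
            best, configs = tt, [order]   # strictly smaller table found
        elif tt == best:
            configs.append(order)         # another order with the same table
    return best, configs


H = lambda a, b: a and not b
G = lambda a, b: a and b
canon_h, configs_h = canonicalize(H, 2)   # ("0010", [(1, 0)])
canon_g, configs_g = canonicalize(G, 2)   # ("0001", [(0, 1), (1, 0)])
```

For the symmetric function G, both input orders yield the canonical truthtable, so the returned set contains two configurations.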

One of ordinary skill will realize that other embodiments do not convert the BDD representation of the supplied function to a truthtable representation in order to perform canonicalization. These embodiments directly perform the canonicalization operation at 810 on the BDD representation of the supplied function. Jerry Burch and David Long, “Efficient Boolean Function Matching,” Proc. ICCAD 1992, describe obtaining semi-canonical BDD representations. The process 800 applies the canonicalization operation to the truthtable representation since this approach is fast, especially for functions with a small number of inputs, as it can be efficiently implemented by using fast machine-implemented bitwise operations.

After identifying the canonical truthtable representation and the set of input variable configurations that lead to this representation, the process 800 selects (at 815) one of the identified set of input variable configurations either deterministically or randomly based on the operational parameter that it receives. During run-time generation of indices for a candidate sub-network, the operational parameter directs the process 800 to select an input variable configuration randomly.

However, the operational parameter directs the process 800 to select an input order deterministically during pre-tabulation. The input order identifier always returns (at 810) the set of input variable configurations in a particular order. When the process 800 operates deterministically during pre-tabulation, it always selects the same input configuration in the returned set of input configurations as the designated input configuration. For instance, in some embodiments, this process always selects the first input configuration in the returned set as the designated configuration. The operation of the indexer 515 during pre-tabulation is further described below in Section IV.
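A minimal sketch of the selection at 815 is given below, assuming the set of configurations is returned as an ordered list. The function name `select_configuration` and the `rng` parameter are hypothetical.

```python
import random


def select_configuration(configs, deterministic, rng=None):
    """Pick one input configuration from the ordered list returned at 810.

    deterministic=True models pre-tabulation (always the first entry);
    deterministic=False models run-time index generation (random entry).
    """
    if deterministic:
        return configs[0]
    rng = rng if rng is not None else random
    return rng.choice(configs)


configs = [(1, 0, 2), (2, 0, 1), (0, 1, 2)]
stored_choice = select_configuration(configs, deterministic=True)
query_choice = select_configuration(configs, deterministic=False,
                                    rng=random.Random(0))
```

The deterministic branch depends only on the order of the returned list, which is why the text requires the input order identifier to return the configurations in a fixed order.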

After selecting an input variable configuration at 815, the process 800 directs (at 820) the hasher 715 to map the resulting canonical truthtable representation of the function F to an index into the network database. The truthtable representation does not serve as a good index into the database as it can be a long bitstring in many instances. Accordingly, the hasher maps the truthtable representations to hashed index values that are shorter than the truthtable representations. Section III.B.3 describes the hashers used by some embodiments of the invention. After 820, the process 800 returns (at 825) the hash value identified at 820 and then ends.

2. Computing Indices for a Multi-Function Query

FIG. 9 conceptually illustrates a process 900 that identifies the indices for a multi-function query. The index manager 720 directs this process whenever it receives a multi-function query. As shown in FIG. 9, the index manager 720 initially selects (at 905) one of the functions in the query as a pivot function. When the query has multiple pivot functions (i.e., when the candidate sub-network selected at 210 has multiple pivot functions), the index manager 720 randomly selects (at 905) one of the pivot functions in the query. In the description below, the selected pivot function is designated as the first function F_1 in order to simplify the discussion.

The index manager 720 will use the pivot function to identify an order for the input variables to the sub-networks. Accordingly, the embodiments described below select as the pivot function a function that is dependent on all the input variables, since such a pivot function could be used to impose an ordering for all input variables.

As mentioned above, the process 200's selection (at 210) of the candidate sub-network ensures that this sub-network has at least one function that is dependent on all the input variables. Also, in the embodiments described below, the sub-network pre-tabulation comports with such a selection of the pivot function, since the pre-tabulation process pre-tabulates only multi-function sub-networks that have at least one pivot function.

In other embodiments, the selection of the candidate sub-network at 210 and/or the pre-tabulation of the sub-networks do not need to ensure the existence of at least one pivot function (i.e., of an output function that is dependent on all input variables). For instance, the process 900 could deal with candidate sub-networks that do not have an output function dependent on all input variables by terminating its search for replacement sub-networks. Alternatively, the process 900 could deal with multi-function queries that do not have pivot functions by selecting (at 905) a function that is dependent on the largest number of inputs and then using a particular scheme to specify the position of the other sub-network inputs. For instance, if the selected candidate sub-network has five inputs, and its best output function depends on only four inputs, the process 900 might select the best output function and position the fifth input as the leftmost input. Accordingly, the existence of the pivot function is not crucial, but rather is employed by the embodiments described below in order to improve the hit rate of the optimizer.

After 905, the process 900 computes (at 910) the index I_1 for the selected pivot function F_1. The index manager uses process 800 of FIG. 8 to compute this index value I_1. As described above, the canonicalization procedure at 810 returns the canonical truthtable representation of the function F_1 plus an input ordering. The returned input variable order is used to generate the indices for the remaining non-pivot functions in the query.

Specifically, for each particular non-pivot function, the index manager uses (at 915) the translator 705 to generate the truthtable representation for the non-pivot function, based on the input variable ordering returned at 910. At 915, the index manager also uses the hasher 715 to generate an index for each non-pivot function from the function's generated truthtable representation. At 920, the index manager 720 then returns the generated set of indices to the query manager 505.
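The flow of process 900 may be illustrated, under simplifying assumptions, by the sketch below. It assumes an exhaustive canonicalization, takes the first pivot function and the first canonical configuration rather than random ones (so that the result is reproducible), and accepts any callable as `hasher`. All function names are hypothetical.

```python
from itertools import permutations, product


def truthtable(func, n, order):
    """Bit-string truthtable; order[col] names the argument fed by column col."""
    bits = []
    for row in range(2 ** n):
        args = [0] * n
        for col, arg_pos in enumerate(order):
            args[arg_pos] = (row >> col) & 1
        bits.append(str(int(bool(func(*args)))))
    return "".join(bits)


def depends_on_all_inputs(func, n):
    """True if flipping each input changes the output for some assignment."""
    for i in range(n):
        sensitive = False
        for args in product((0, 1), repeat=n):
            flipped = list(args)
            flipped[i] ^= 1
            if bool(func(*args)) != bool(func(*flipped)):
                sensitive = True
                break
        if not sensitive:
            return False
    return True


def multi_function_indices(funcs, n, hasher):
    """Return [pivot index, non-pivot indices...] or None if no pivot exists."""
    pivots = [f for f in funcs if depends_on_all_inputs(f, n)]
    if not pivots:
        return None
    pivot = pivots[0]  # assumption: fixed choice instead of a random one
    tables = {o: truthtable(pivot, n, o) for o in permutations(range(n))}
    canonical = min(tables.values(), key=lambda tt: int(tt, 2))
    order = next(o for o, tt in tables.items() if tt == canonical)
    indices = [hasher(canonical)]
    # Non-pivot truthtables are computed under the pivot's input ordering.
    indices += [hasher(truthtable(f, n, order)) for f in funcs if f is not pivot]
    return indices


H = lambda a, b: a and not b   # depends on both inputs, so it can be a pivot
A = lambda a, b: a             # depends on only one input
placeholder_hasher = lambda tt: int(tt, 2)  # stand-in for the hasher 715
indices = multi_function_indices([H, A], 2, placeholder_hasher)
```

In this example the pivot H is canonicalized to 0010 under the swapped input order, and the non-pivot function is then tabulated under that same order, giving 0011.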

In sum, the indexing scheme used by the indexer is as follows in some embodiments of the invention. The indexer converts a single-function query into a single index into the network database, and this single index identifies a set of one or more sub-networks that realize the function of the query. For a multi-function query, this indexer selects one of the functions as a pivot function, specifies an input variable order based on this pivot function, and generates an index for this pivot function. Based on the specified order, it then generates an index for each of the non-pivot functions of the query.

As described below, the pre-tabulation process associates each particular combinational-logic function with exactly one pivot index, which is used whenever the particular function serves as pivot for a certain query. Each function's pivot index (also called primary index) is the index that the indexer generates during pre-tabulation by using a process similar to the process 800, except that during pre-tabulation the index generation process deterministically selects one of the input variable configurations as opposed to the random selection at 815. In addition, during pre-tabulation, each combinational-logic function is associated with a number of secondary indices. Secondary indices of a function F are used to number (or identify) the cases when a query is made for some set {G, . . . , F, . . . } where a different function G serves as pivot function and therefore determines the input ordering.

3. Hasher

In some embodiments, the hasher 715 uses a hashing function that is described in “An Optimal Algorithm for Generating Minimal Perfect Hashing Functions,” by Zbigniew J. Czech, et al., Information Processing Letters, 43(5):257-264, October 1992 (“Czech's paper”). This hashing function is referred to as the “minimal” hashing function. Czech's paper describes in detail how to generate this hashing function.

In the current context, the minimal hashing function is generated during the pre-tabulation of the sub-networks. Specifically, during pre-tabulation, the data generator generates numerous sub-networks. For each sub-network, the data generator has the indexer compute the canonical truthtable representation of each pivot function of the sub-network. When the sub-network performs more than one function, the indexer also specifies an input variable configuration based on the canonical truthtable representation, and generates a truthtable representation for each non-pivot function of the sub-network. Based on the computed truthtable representations, the data generator uses a hashing function creator (not shown) to associate each of the precomputed truthtable representations with a unique index value and generates a hashing function that realizes this association. The creation of the minimal hashing function from a defined (static) hash table is described in detail in Czech's paper. During optimization, the hasher uses the generated hashing function to generate an index for each truthtable that the hasher receives from the index manager 720.

This minimal hashing function has the advantage that it takes only a small amount of memory, while a single evaluation of the hashing function still runs in time that is linear in the size of the truthtable that is input to the hashing function. However, the tradeoff is that this function always returns a value for any truthtable representation. For a truthtable that is equal to one of the precomputed tables, the hashing function always returns its unique index value. For any other input, this function also returns some integer, but it is difficult to check whether the returned value is a valid value for a valid input or just an arbitrary output for an invalid input. Consequently, because of this hashing function, the query manager has to determine at 630 whether a retrieved replacement sub-network performs the set of output functions of the selected candidate sub-network.

Other embodiments might use other hashing approaches. For instance, some embodiments might use more traditional hashing techniques, such as those described in Cormen, Leiserson, Rivest, and Stein, “Introduction to Algorithms,” Second edition, MIT Press, 2001, Chapter 11. Traditional hashing approaches specify values for only valid keys (i.e., for only valid truthtable representations). For instance, some traditional hashers support the query “retrieve a value stored in hash table for a particular key” and the query “is anything for a particular key stored in the hash table.” Other traditional hashers return some dedicated NULL value for any query of type “retrieve value stored in hash table for some key” when no element is stored for the key.

Traditional hashers have the advantage that they can detect that there is no hashed value for a particular truthtable. Accordingly, in such circumstances, the indexer would not return indices for functions that do not have an associated hashed value and instead would notify the query manager that there is no associated index for the particular function. Because of this, the query manager would not retrieve irrelevant sub-networks from the database and, therefore, would not need to perform the matching determination at 630. On the other hand, a traditional hashing approach requires the storage of all valid keys within the hash table. This, in turn, would require the storage of all lengthy truthtable representations, which would require a large amount of memory.
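The tradeoff between the two hashing approaches can be illustrated with the following sketch. It builds a toy minimal perfect hash by brute-force seed search over a salted SHA-256 digest, which is an assumption made for illustration and is not the algorithm of Czech's paper. The names `build_minimal_perfect_hash` and `_salted_hash` are hypothetical.

```python
import hashlib


def _salted_hash(key, seed, n):
    """Map key to 0..n-1 using a seed-dependent digest."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def build_minimal_perfect_hash(keys):
    """Search for a seed that maps the n keys one-to-one onto 0..n-1.

    Only the seed and n need to be kept afterwards, not the keys themselves.
    """
    n = len(keys)
    for seed in range(100_000):
        if len({_salted_hash(k, seed, n) for k in keys}) == n:
            def mph(key, seed=seed, n=n):
                return _salted_hash(key, seed, n)
            return mph
    raise RuntimeError("no suitable seed found")


precomputed = ["0001", "0010", "0110", "0111"]   # tabulated truthtables
mph = build_minimal_perfect_hash(precomputed)

valid_indices = sorted(mph(tt) for tt in precomputed)  # a permutation of 0..3
unknown_index = mph("1111")   # still an integer in 0..3, although not stored

# A traditional hash table stores the keys and can therefore report absence.
traditional = {tt: i for i, tt in enumerate(precomputed)}
missing = traditional.get("1111")   # None
```

The toy function cannot distinguish a stored truthtable from an unknown one, which mirrors why the retrieved sub-networks must still be verified at 630, whereas the dictionary detects the missing key at the cost of storing every key.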

4. Input Variable Correction by the Query Manager

As mentioned above, the query manager determines (at 630) whether a retrieved replacement sub-network matches the candidate sub-network selected at 210. The replacement sub-network matches the candidate sub-network only if the replacement sub-network performs all the output functions of the candidate sub-network (i.e., all the functions in the set of functions F identified at 215 for the candidate sub-network) for a particular input variable configuration of the replacement sub-network.

FIGS. 10-13 illustrate an example of such a matching determination. FIG. 10 illustrates a candidate sub-network 1005 from a network 1000. The candidate sub-network includes two AND gates 1010 and 1015, and a NAND gate 1020. The candidate sub-network receives inputs x_0, x_1, and x_2 in the order illustrated in FIG. 10. When the output functions are specified by a BDD package (such as BuDDy described above), the input variables have a natural ordering as they are just integer indices provided by the BDD package.

In the example illustrated in FIG. 10, the candidate sub-network 1005 has only one output function F. Based on this output function, the indexer generates an index, and the table manager retrieves a replacement sub-network. FIG. 11 illustrates the circuit schematic for such a replacement sub-network after it has been decoded. (This schematic is only for explanatory purposes since after the replacement sub-network is decoded at 625, the sub-network is represented by a graph and a local function for each node in the graph.)

As shown in FIG. 11, the replacement sub-network 1105 has two gates and three inputs I_0, I_1, and I_2. One gate is a two-input AND gate 1110 with an inverted input. The other gate is a two-input AND gate 1115. The inputs have a generic ordering. Specifically, when the replacement sub-network is decoded, it is a graph made of a linked node list. If the replacement sub-network has n inputs, then the decoded network structure has a “dummy” node 1120 for each input. Internal nodes that are fed by one of these network inputs have a directed arc from the network input to one of their node inputs. The dummy nodes are stored in an array, so they have a generic ordering just by array index.

The matching operation at 630 tries to identify an input configuration for the replacement sub-network 1105 so that this sub-network's set of output functions includes the output function of the candidate sub-network selected at 210. In other words, the matching operation tries to identify a reordering of the array of “dummy” nodes and a correlation of each dummy node's input with one of the input variables (x_0 to x_2) that enables the replacement sub-network to perform the output function set of the candidate sub-network 1005.

As shown in FIG. 11, because of its input configuration, the output function of the replacement sub-network 1105 has a different truthtable from the output function F of the candidate sub-network 1005. FIG. 12, however, illustrates an input configuration for the replacement sub-network 1105 that produces the same output function (i.e., has the same output truthtable) as the candidate sub-network's output function F. FIG. 13 illustrates how the configuration illustrated in FIG. 12 can be inserted in the network 1000.

FIG. 14 illustrates a process that the query manager 505 performs at 630 to determine whether a replacement sub-network matches the candidate sub-network selected at 210. As shown in this figure, the process 1400 initially identifies (at 1405) all pivot functions of the replacement sub-network. In other words, it identifies all pivot functions in the replacement sub-network's function set G = {G_0, . . . , G_n}, where each function is computed with respect to the input ordering after the decoding of the sub-network at 625.

The process then selects (at 1410) one of the identified pivot functions. For the selected pivot function, the query manager has the indexer compute a set of indices deterministically (i.e., performing an index-generation operation that always produces the same set of indices for the same pivot function and the same set of non-pivot functions).

To perform this deterministic operation, the process 1400 initially has the indexer deterministically identify (at 1415) an input variable configuration P that leads to the canonical truthtable representation of the pivot function. To identify such a configuration, the index manager 720 (1) has the translator 705 generate a truthtable representation of the pivot function, and then (2) has the input order identifier 710 identify a set of input configurations that lead to the smallest-valued truthtable representation of the pivot function. The input order identifier always returns the set of input variable configurations in a particular order. The index manager always selects the same input configuration in the returned set of input configurations as the input configuration P identified at 1415. For instance, in some embodiments, the index manager always selects the first input configuration in the returned set as the designated configuration P.

This deterministic operation of the indexer during run-time optimization works in conjunction with its deterministic operation during the database generation, as further described below. Also, this deterministic operation during input-configuration matching is different from the above-described randomized operation of the process 800 during the generation of a set of indices for a set of functions of the candidate sub-network selected at 210. In its randomized operation, the process 800 randomly selects (at 815) an input variable configuration from the set of viable input configurations for a particular function.

After deterministically identifying (at 1415) an input variable configuration P, the process 1400 readjusts (at 1420) the identified input configuration based on the configuration used at 605 to generate the set of indices for the candidate sub-network selected at 210. For instance, assume that the set of output functions of the candidate sub-network had an initial configuration R, and that the indexer selected (at 815) an input configuration R′ that produced the canonical representation of a particular function. If an operation Q has to be performed on R to obtain R′, then the process 1400 obtains (at 1420) a new configuration by applying the inverse of the operation Q to the input configuration P identified at 1415 (i.e., by computing Q⁻¹(P)).
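The readjustment at 1420 can be pictured with permutations, as in the sketch below. It assumes that a configuration is a tuple of input names and that an operation Q is a position permutation applied as `new[i] = old[Q[i]]`; this encoding, and the names `apply`, `inverse`, and `operation_between`, are assumptions made for illustration only.

```python
def apply(perm, config):
    """Reorder config so that position i receives config[perm[i]]."""
    return tuple(config[p] for p in perm)


def inverse(perm):
    """Return the permutation that undoes perm."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return tuple(inv)


def operation_between(r, r_prime):
    """Return Q such that apply(Q, r) == r_prime (names must be unique)."""
    return tuple(r.index(name) for name in r_prime)


R = ("x0", "x1", "x2")          # initial configuration of the candidate
R_prime = ("x2", "x0", "x1")    # configuration selected at 815
Q = operation_between(R, R_prime)

P = ("I1", "I2", "I0")          # hypothetical configuration found at 1415
adjusted = apply(inverse(Q), P)  # Q^-1 applied to P
```

Applying Q and then its inverse returns the original configuration, which is the property the readjustment relies on.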

The process 1400 then identifies (at 1425) the set of functions H performed by the replacement sub-network based on the input ordering identified at 1420. In some embodiments, the process specifies (at 1425) each function in terms of its ROBDD. Also, in some embodiments, the identified set of output functions includes an output for each node of the replacement sub-network. The output function at a particular node's output is the particular node's local function when the particular node does not receive the output of any other node in the replacement network graph. Alternatively, when the particular node receives the output of another node or other nodes in the replacement sub-network, an output function at the particular node's output can be derived from the particular node's local function and the local function of each node whose output the particular node receives.
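The derivation of each node's output function from local functions can be sketched as follows, using truthtables in place of ROBDDs for simplicity. The graph encoding (a topologically ordered list of nodes whose fan-ins are tagged as network inputs or earlier nodes) and the example wiring are hypothetical and are not taken from FIG. 11.

```python
def node_output_tables(n_inputs, nodes):
    """Return the truthtable of every node output.

    nodes is a topologically ordered list of (local_func, fanins); each fanin
    is ("in", i) for network input i or ("node", j) for an earlier node j.
    """
    tables = [[] for _ in nodes]
    for row in range(2 ** n_inputs):
        inputs = [(row >> i) & 1 for i in range(n_inputs)]  # input 0 toggles fastest
        values = []
        for func, fanins in nodes:
            # A node fed only by network inputs realizes its local function;
            # otherwise its output composes the outputs of earlier nodes.
            args = [inputs[i] if kind == "in" else values[i] for kind, i in fanins]
            values.append(int(bool(func(*args))))
        for k, v in enumerate(values):
            tables[k].append(str(v))
    return ["".join(t) for t in tables]


# Hypothetical two-gate network: an AND with one inverted input feeding an AND.
nodes = [
    (lambda a, b: a and not b, [("in", 0), ("in", 1)]),
    (lambda a, b: a and b, [("node", 0), ("in", 2)]),
]
tables = node_output_tables(3, nodes)
```

The first node's table is simply its local function over the network inputs, while the second node's table is obtained by composing its local function with the first node's output.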

After 1425, the process then determines (at 1430) whether the set of functions H identified at 1425 includes the set of functions F that were identified for the candidate sub-network at 215. If so, the replacement sub-network matches the candidate sub-network for the input configuration specified at 1425. Accordingly, the process 1400 specifies (at 1435) a match and the input configuration resulting in this match and then ends.

On the other hand, if the process determines (at 1430) that the set of functions H identified at 1425 does not include the candidate sub-network's set of functions F, the process determines (at 1440) whether it has examined all the pivot functions identified (at 1405) for the replacement sub-network. If not, the process transitions back to 1410 to select another pivot function, and performs the subsequent operations for this function. Otherwise, the process 1400 specifies (at 1445) that the replacement sub-network does not match the candidate sub-network and then ends.

C. Database Tables and Table Manager

Once the indexer 515 returns a set of indices to the query manager, the query manager directs the table manager 520 to retrieve all sub-networks that are associated with the returned set of indices. In response, the table manager searches the database 525 and returns a set of sub-networks. This set is an empty set when the database 525 does not store any sub-network that is associated with the received set of indices. When the set of identified sub-networks is not empty, the sub-networks in this set are in an encoded form, and the table manager returns these sub-networks in this form to the query manager.

1. Database Tables

In the embodiments described below, the database 525 is a relational database, and the table manager 520 is the querying engine of this database. FIGS. 15 and 16 conceptually illustrate the database schema used by the database 525. One of ordinary skill will realize that these figures simply conceptually illustrate the database design for some embodiments of the invention, and that other embodiments of the invention might use other database designs.

FIGS. 15 and 16 illustrate five database tables. These tables include (1) the pivot table 1500 and secondary table 1505, which are illustrated in FIG. 15, and (2) the sub-network table 1510, the graph table 1515, and the function table 1520, which are illustrated in FIG. 16.

As shown in FIG. 15, the pivot table 1500 includes a row for each pivot index 1525, while the secondary table 1505 includes a set of one or more rows for each pivot index 1525. In addition, each particular pivot table row in the pivot table specifies the first row 1530 and the last row 1535 in the secondary table 1505 for the pivot index of the particular pivot-table row. During pre-tabulation, all the rows in the secondary table 1505 that are related to the same pivot index are arranged next to each other in a particular order (e.g., sorted by increasing secondary index stored in each secondary table row). Accordingly, a particular pivot table row needs to describe only an interval of rows within the secondary table (i.e., needs to identify only the first and last secondary table rows) rather than enumerating all rows explicitly.

The rows of the secondary table 1505 have variable lengths. Hence, each row includes a field 1540 that specifies the row's length. In addition, each row of the secondary table 1505 has a field 1545 for specifying a particular secondary index associated with a particular pivot index. A secondary table row might store a null value in its secondary index field, as further described below.

The secondary table 1505 also expresses the relation between pre-tabulated sub-networks and the pivot and secondary indices. Specifically, each particular secondary table row has a set of fields 1550 that specify a set of one or more network indices for each pair of pivot and secondary indices that the particular secondary table row stores. The set of network indices within a single row are sorted in a particular order (e.g., by increasing network indices) during pre-tabulation. Each set of network indices can include one or more indices.

After identifying a range of rows for the pivot index of a set of database indices I received from the query manager, the table manager tries to identify a secondary table row in the identified range for each secondary index in the received set of database indices. If the table manager is successful, each network index that is stored in all the identified secondary table rows specifies a potentially matching sub-network that the table manager should return to the query manager.

Each network index that is stored in a secondary table row specifies a row in the sub-network table 1510 (also referred to below as the network table), which is illustrated in FIG. 16. Each row in the network table 1510 corresponds to a particular sub-network. In the embodiments described below, each sub-network is specified by (1) a graph having one or more nodes, and (2) a set of functions that includes a local function for each node of the graph. Accordingly, each row in the network table specifies a graph-table index 1555 and a set of function-table indices 1560.

A graph-table index 1555 identifies a graph-table row that stores the encoded graph 1565 for the sub-network associated with a network index of the network table row. As further described below in Section III.D, each encoded graph 1565 specifies in an encoded manner (1) one or more nodes of the sub-network, and (2) the connections between these nodes. Also, FIG. 16 illustrates each graph-table row in the graph table 1515 to include a set of fields 1570 that store additional attributes of the graph stored in that row. Examples of such attributes include the actual or estimated size or speed of the sub-network represented by the graph. Typically, such attributes can be derived from the structure of the graph, but they can be stored in the database to speed up the operation of the optimizer.

The function table 1520 stores the local functions 1575 of the graph nodes. In some embodiments, the functions are stored in an ROBDD format in the function table. The function table is indexed by the function-table indices 1560 that are stored in this table and in the network table 1510. More specifically, each function-table index 1560 in a network table row corresponds to a particular node of a graph 1565 that is indexed in the graph table 1515 by the network table row's graph-table index. A network table row stores its function-table indices in a particular manner that corresponds to the ordering of the nodes. For example, for a multi-function set, the first function index corresponds to the first node's function, the second function index corresponds to the second node's function, etc. For its corresponding graph node, a function-table index specifies a function 1575 in the function table.

Although FIG. 16 illustrates only one function table, other embodiments use multiple function tables. For instance, some embodiments might have one function table for all two-input functions, one function table for all three-input functions, one function table for all four-input functions, and one function table for five or more input functions. For a particular function index of a particular node, the table manager in some of these embodiments identifies the particular node's number of inputs and then uses the particular function index to identify the function in the function table for the identified number of inputs.
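The two-step lookup described above (select a table by the node's input count, then index into that table) can be sketched as follows. This is only an illustration: the table contents, the key 5 standing for "five or more inputs," and the name lookup_function are assumptions, and functions are shown as names rather than in the ROBDD form that the text describes.

```python
from typing import Dict, List

# Hypothetical layout: one function table per input count. The key 5 stands
# for "five or more inputs". Entries are plain names for readability; the
# text stores the functions themselves in ROBDD form.
FUNCTION_TABLES: Dict[int, List[str]] = {
    2: ["AND2", "OR2", "XOR2"],
    3: ["AND3", "MUX"],
    4: ["AND4"],
    5: ["AND5", "OR6"],
}


def lookup_function(num_inputs: int, function_index: int) -> str:
    """Select the table by the node's input count, then index into it."""
    table = FUNCTION_TABLES[min(num_inputs, 5)]
    return table[function_index]
```

Because each table only has to distinguish functions of one input count, the per-table indices can be smaller than an index into a single combined table.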

The schema described above allows fast search for matching networks. In other words, no table needs to be fully scanned. Instead, this schema provides direct access to pivot rows, a binary search for secondary rows, a linear scan for the computation of the intersection of the secondary rows, and direct access to the functions, graphs, and networks stored in the function table, graph table, and network table.

2. Table Manager

Given a set of database indices I, which may include one or more indices I_1, . . . , I_n, the table manager has to return a graph and a set of functions for each sub-network SNW, where (1) each sub-network SNW is related to the pivot index of the received set, and (2) the sub-network SNW and the pivot index are related to all secondary indices in the received set. In other words, the table manager has to return all sub-networks that are associated with the pivot index and all secondary indices of the received database-index set.

This task is a standard function that is supported by any relational database management system, such as those offered by Oracle Corp. or Informix, Inc. Therefore, the table manager can be directly realized with any commercially or non-commercially available relational database system.

FIG. 17 conceptually illustrates a process 1700 that the table manager 520 uses in some embodiments to retrieve pre-tabulated sub-networks (i.e., pre-tabulated encoded graphs and functions) from the database 525. The table manager performs this process each time it receives a set of database indices I, which might include one index or several indices.

As shown in FIG. 17, the table manager initially determines (at 1705) whether the pivot table has a row for the pivot index of the received set of database indices. If not, the table manager informs (at 1710) the query manager that there is no matching sub-network (i.e., the database does not store a sub-network that is related with all the indices of the query) and then ends.

Otherwise, the process accesses (at 1715) the row in the pivot table 1500 that is for the pivot index of the received set of database indices. When the received database-index set includes only one index, this set's pivot index is its one and only index. On the other hand, the pivot index of a multi-index set is the index that the process 900 specified at 905 and 910.

The table manager accesses (at 1715) the pivot-table row of the set's pivot index in order to identify a range of rows in the secondary table 1505. This range of rows specifies all the secondary indices that might be associated with the pivot index of the received database-index set. The table manager can search the pivot table 1500 in constant time since the pivot table is sorted such that each row's number is equal to that row's “Pivot_Index.”

At 1720, the table manager selects a secondary index in the received database-index set. It then determines (at 1725) whether a row in the identified range has a secondary index that matches the selected secondary index. If not, the table manager then informs (at 1710) the query manager that there is no matching sub-network and then ends.

On the other hand, if the process identifies (at 1725) a row in the identified range that has a secondary index that matches the selected secondary index, it retrieves (at 1730) the set of network indices from the selected row. Because the rows within the secondary table are sorted by secondary indices, the table manager's search of this table can be done efficiently as a binary search. Also, when the received database-index set includes only one index, a secondary table row matches the received database-index set if it specifies the received set's pivot index and a null for the secondary index.

Next, at 1735, the process determines whether it has searched for all secondary indices in the received database-index set. If not, the process transitions back to 1720 to select another secondary index in this set. Otherwise, the process determines (at 1740) whether the received database index set includes more than one secondary index. If not, the process transitions to 1755, which is described below.

However, if the received database index set includes more than one secondary index, the process cross compares (at 1745) all the sets of network indices retrieved at 1730 to identify each network index that is in all sets of network indices retrieved at 1730. The table manager can perform this cross-comparison in linear time (i.e., in an amount of time that scales linearly with the number of network indices) because the network indices are sorted in a particular order.

Next, the process determines (at 1750) whether this cross-comparison leads to an empty set. If so, the table manager then informs (at 1710) the query manager that there is no matching sub-network and then ends. Otherwise, the process transitions to 1755. For each network index that the process identifies to be in all the sets of network indices retrieved at 1730, the process (at 1755) identifies graph- and function-table indices from the network table. As mentioned above, each network index specifies a graph-table index and a set of function-table indices. At 1755, the process (1) uses each identified graph-index to retrieve an encoded graph from the graph table, and (2) uses the set of function indices associated with the identified graph-index to retrieve a set of functions for the nodes of the retrieved graph. The table manager then returns (at 1760) the retrieved sets of graphs and functions to the query manager.
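The retrieval steps of process 1700 can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the implementation of any embodiment: the in-memory layout (a pivot table holding row ranges, with parallel lists sec_keys and sec_nets for the secondary table), the sentinel NULL, and the names intersect_sorted and find_networks are all hypothetical. It shows direct access by pivot index, a binary search within the pivot's range of secondary rows, and a linear-time intersection of sorted network-index lists.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence, Tuple

NULL = -1  # assumed sentinel for "no secondary index" (single-index queries)


def intersect_sorted(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Linear-time intersection of two ascending lists of network indices."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def find_networks(
    pivot_table: Sequence[Optional[Tuple[int, int]]],
    sec_keys: Sequence[int],
    sec_nets: Sequence[Sequence[int]],
    pivot: int,
    secondaries: Sequence[int],
) -> List[int]:
    """Return the network indices matching the pivot and all secondary indices."""
    # Direct access: the row number equals the pivot index.
    if not 0 <= pivot < len(pivot_table) or pivot_table[pivot] is None:
        return []
    lo, hi = pivot_table[pivot]  # range of rows in the secondary table
    result: Optional[List[int]] = None
    for s in (secondaries or [NULL]):
        # Binary search within the pivot's range of secondary rows.
        pos = bisect_left(sec_keys, s, lo, hi)
        if pos == hi or sec_keys[pos] != s:
            return []  # no matching sub-network
        nets = list(sec_nets[pos])
        result = nets if result is None else intersect_sorted(result, nets)
        if not result:
            return []  # the cross-comparison produced an empty set
    return result or []


# Hypothetical example data: pivot 1 owns secondary rows 0..2, pivot 2 owns row 3.
PIVOT_TABLE = [None, (0, 3), (3, 4)]
SEC_KEYS = [NULL, 4, 9, NULL]
SEC_NETS = [[0, 1, 2, 5], [1, 2, 5], [2, 5, 7], [3]]
```

Each returned network index would then be used to read a graph-table index and a set of function-table indices from the network table, as described at 1755.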

D. Sub-Network Encoding and Decoding

Once the table manager returns a set of graphs and functions to the query manager, the query manager determines (at 615) whether the table manager returned any sub-network. If so, the query manager selects (at 620) one of the returned sub-networks and directs (at 625) the network encoder 510 to decode the selected sub-network.

In order to store a large number of sub-networks in a database, some embodiments use a compact encoding of the sub-networks. The embodiments described below use an encoding scheme that has three levels of encoding.

1. Graph and Function Tables

The first level of encoding (which was described above by reference to FIG. 16) specifies each sub-network by (1) a graph having one or more nodes, and (2) a set of functions that includes one local function for each node of the graph. This manner of specifying sub-networks exploits similarities between different sub-networks.

Specifically, each sub-network can be described by its structure and by the set of functions performed by its node or nodes. Different sub-networks might have similar structures or might have nodes that perform similar functions. Consequently, the encoding scheme (1) stores the structural and functional attributes separately, and then (2) describes each sub-network by reference to the stored structural and functional properties.

In the embodiments described below, the structure of each sub-network is described in terms of a directed acyclic graph. The nodes of such a graph represent the circuit elements of the sub-network, and the directed edges correspond to the connections between the sub-network's circuit elements.

Because of the offline computation of the database, all graphs that occur as network structures within the database are known in advance. Moreover, many of these graphs are isomorphic, i.e., they are identical with respect to a certain numbering of the nodes. In fact, the number of distinct non-isomorphic graphs is very small compared to the total number of sub-networks in the database.

Accordingly, in some embodiments, the graph table 1515 illustrated in FIG. 16 stores all non-isomorphic graphs up to a certain size. Section IV below describes techniques for generating a table of all smaller non-isomorphic graphs up to a certain size. Each generated graph structure is stored as an entry in the graph table 1515. In some embodiments, entries in this table are numbered (indexed) sequentially from 0, . . . , n.

Also, in some embodiments, the function table 1520 stores all local functions that can occur in any of the pre-computed sub-networks. As all possible local functions are known beforehand, and their number tends to be small, such a table can be easily generated during the pre-tabulation process.

For a database that is bounded to a specific technology library, the local functions are taken from the specific technology library (i.e., they correspond to all logic functions that can be computed by a single circuit element in the library). Some technology libraries contain fewer than 256 different logic functions that can be implemented by a single block. Therefore, for such a library, an index 1560 into table 1520 can be implemented as a single byte. A further reduction can be achieved by using several function tables to store combinational-logic functions with the same number of inputs separately, as described above.

For a database that is not bound to a specific technology library, the function table can include local functions from one or more technology libraries and/or additional abstract functions that are not from any particular technology library. Adding additional functions to the function table increases the size of the database (i.e., increases the number of sub-networks specified in the database). However, some efficiency can be realized by using several function tables to store the functions according to their number of inputs.

2. Encoding Each Graph

The second level of encoding relates to the encoding of the graph structures. When the number of graphs is relatively small (e.g., 10000-50000), any reasonably sparse compression encoding of the graph structures can be used. Some embodiments use the following schema:

-   Graph_Encoding={Node_1_Encoding} . . . {Node_N_Encoding}, where
-   Node_J_Encoding={Node_Identifier}{Edge_1_Encoding} . . . {Edge_M_Encoding}, where
-   Edge_I_Encoding={Edge_Identifier}{Node_X_Index}.

In other words, this schema defines each encoded graph in terms of one or more encoded nodes (i.e., in terms of one or more Node_J_Encoding's). Each encoded node is defined in terms of an identifier (Node_Identifier) and one or more encoded edges (i.e., one or more Edge_I_Encoding's). The Node_Identifier specifies the start of the description of an encoded node. Also, each encoded edge for a node specifies an incoming edge to the node.

Each encoded edge is defined in terms of an identifier (Edge_Identifier) and a node index (Node_X_Index). The Edge_Identifier specifies the start of the description of an encoded edge, while the node index identifies the graph node from which the edge is coming. One of ordinary skill will realize that other embodiments might use a schema that specifies outgoing edges of nodes as opposed to incoming edges. Only incoming or outgoing edges need to be defined for each encoded node because the graph is a directed one.

A more specific version of the schema described above stores each graph nodewise according to a certain numbering of the nodes from 0, . . . , n−1. This schema encodes each graph as a bitstring. In this bitstring, this schema uses a single “1” bit as the common node identifier (Node_Identifier) for each node in the graph, and a single “0” bit as the common edge identifier (Edge_Identifier) for each edge of each node. Also, the node index for each edge is an integer that corresponds to the number assigned to the starting node for the edge.

Some embodiments store only sub-networks with fewer than 16 nodes/network in the database. In these embodiments, it is possible to encode the index of a starting node of an edge with 4 bits. Therefore, the maximum number of bits that this scheme uses is provided by the equation below:

Maximum number of bits = #nodes + #edges × (1 + 4).

A further reduction is achieved based on the following observation. Some embodiments require combinational-logic sub-networks to be acyclic, i.e., require a certain ordering of the nodes such that any node “i” only has ingoing edges from nodes “j”, where “j”<“i”. In other words, this means that the input of a certain node must not depend on the output of this node. Computing such an ordering of the nodes requires linear time only. From such an ordering, it follows that the starting node of an edge incident on node “i” is within the range of 0, . . . , i−1. Accordingly, the node index of each starting node can be encoded (1) with 1 bit for nodes 0, . . . , 2, (2) with 2 bits for nodes 3, . . . , 4, (3) with 3 bits for nodes 5, . . . , 8, and (4) with 4 bits for nodes 9, . . . , 15. For graphs with at most 8 nodes, this results in a further reduction by at least one bit/edge.
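The bitstring scheme with variable-width node indices can be sketched as follows. This is a minimal illustration rather than the encoder of any embodiment: the names index_width, encode_graph, and decode_graph are hypothetical, the graph is assumed to be given as a list of incoming-edge sources per node in topological order, and emitting a node marker for node 0 as well is an assumption. The width function reproduces the ranges listed above (1 bit for nodes 0 to 2, 2 bits for nodes 3 to 4, 3 bits for nodes 5 to 8, 4 bits for nodes 9 to 15).

```python
from typing import List


def index_width(node: int) -> int:
    """Bits used for the source index of an edge entering `node`."""
    # Sources of node i lie in 0..i-1, so (i-1).bit_length() bits suffice;
    # at least one bit is always used.
    return max(1, (node - 1).bit_length())


def encode_graph(incoming: List[List[int]]) -> str:
    """Encode a DAG; incoming[i] lists the source node of each edge into node i."""
    bits: List[str] = []
    for node, sources in enumerate(incoming):
        bits.append("1")  # node identifier
        width = index_width(node)
        for src in sources:
            if not 0 <= src < node:
                raise ValueError("edges must come from lower-numbered nodes")
            bits.append("0")  # edge identifier
            bits.append(format(src, f"0{width}b"))  # index of the starting node
    return "".join(bits)


def decode_graph(bits: str) -> List[List[int]]:
    """Inverse of encode_graph."""
    incoming: List[List[int]] = []
    pos = 0
    while pos < len(bits):
        if bits[pos] != "1":
            raise ValueError("expected a node identifier")
        pos += 1
        width = index_width(len(incoming))
        sources: List[int] = []
        while pos < len(bits) and bits[pos] == "0":
            pos += 1  # skip the edge identifier
            sources.append(int(bits[pos:pos + width], 2))
            pos += width
        incoming.append(sources)
    return incoming


# Hypothetical graph: node 0 stands for the inputs, nodes 1-3 have 7 edges in total.
EXAMPLE = [[], [0, 0], [0, 1], [1, 2, 0]]
```

Decoding is unambiguous because the bit that follows a complete node or edge description is either a “1” (a new node) or a “0” (another edge of the current node), and the width of each node index is known from the position of the node being decoded.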

FIGS. 18-20 illustrate an example of the graph encoding scheme described above. FIG. 18 illustrates a graph of a sub-network. This graph includes three nodes and seven edges. Also, in this graph, the three nodes are defined as nodes 1, 2, and 3. Node 0 defines an “abstract” set of nodes from which the inputs to the sub-network originate. FIGS. 19 and 20 illustrate a bitstring that represents the graph of FIG. 18 in an encoded fashion.

This graph encoding yields highly compressed structural descriptions of sub-networks. For instance, if all networks have at most 8 nodes and on average 12 edges, then the average size of a graph encoding is bound by 56 bits (i.e., 8+12×4). For a database with several million networks, empirically fewer than 65536 different graphs are needed, so that the total size of table 1515 is bound by 3670016 bits (i.e., 56 bits × 65536 graphs), which is about 450 kbyte.
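The arithmetic behind these bounds can be checked directly. In the sketch below, reading the factor of 4 bits per edge as one edge-identifier bit plus a three-bit node index (the width that suffices for graphs with at most 8 nodes) is an interpretation of the text, and all variable names are illustrative.

```python
# Assumed reading of "8 + 12 x 4": one node-identifier bit per node, and per
# edge one edge-identifier bit plus a 3-bit source index.
MAX_NODES = 8
AVG_EDGES = 12
EDGE_MARKER_BITS = 1
INDEX_BITS = 3
MAX_GRAPHS = 65536

bits_per_graph = MAX_NODES + AVG_EDGES * (EDGE_MARKER_BITS + INDEX_BITS)
total_bits = bits_per_graph * MAX_GRAPHS
total_bytes = total_bits // 8
total_kib = total_bytes / 1024
```

Under these assumptions, bits_per_graph is 56, total_bits is 3670016, and the table occupies 458752 bytes (448 KiB), which is consistent with the figure of about 450 kbyte.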

One of ordinary skill will realize that other embodiments might use other encoding schemes. For instance, some embodiments might use Huffman encoding or arithmetic encoding to encode the structure of each graph.

IV. Data Generator

FIG. 21 presents a process 2100 that conceptually illustrates several operations performed by the data generator 115 in some embodiments of the invention. As shown in this figure, the data generator specifies (at 2105) numerous sub-networks. These sub-networks might include multi-element sub-networks and/or multi-output sub-networks. For each specified sub-network, the data generator defines (at 2110) a parameter for storing the sub-network. Based on each sub-network's defined parameter, the data generator stores the sub-network in a storage structure 105.

As discussed above, some embodiments use a database as the storage structure, and use indices into this database as the parameters for storing the sub-networks. One of ordinary skill will realize that other embodiments might use different storage structures (e.g., data files) and/or different storage parameters.

FIG. 22 illustrates a more specific process 2200 that the data generator 115 performs in some embodiments to construct and organize the data tables 525. Before it starts, this process receives (1) a set of combinational-logic functions (e.g., Boolean functions) that are allowed to serve as local functions (i.e., as functions of nodes in the combinational-logic sub-network), and (2) information specifying the maximum number of nodes and edges of the sub-networks that are to be constructed. The set of combinational-logic functions can be received in one or more function tables 1520, or can be organized in one or more function tables 1520 by the process 2200. Also, in the embodiments described below, each received function is expressed in terms of its ROBDD.

The received set of combinational-logic functions is called a combinational-logic library. In some embodiments, this library is typically derived from an existing technology library that contains all circuit elements that may be used for a specific technology. In this situation, the combinational-logic library contains all combinational-logic functions that can be computed by a single circuit element within this technology library. Moreover, it contains additional information relating to the physical implementation of the corresponding circuit element (e.g., estimated size of the circuit element, estimated power consumption, timing characteristics, etc.). Such additional information can be used in the various network filters in order to construct databases that contain only networks with specific characteristics.

In other embodiments, the combinational-logic library includes functions that are not all from one specific technology library. For instance, for a database that is not bound to a specific technology library, the combinational-logic library can include local functions from one or more technology libraries and/or additional abstract functions that are not from any particular technology library. The artificially defined functions correspond to artificially defined circuit elements that have artificially defined physical characteristics. By using a combinational-logic library that includes arbitrary local functions, it is possible to construct a database free from any set of specific technology characteristics. Some embodiments require, however, that the set of functions be complete so that the generator can generate most, if not all, combinational-logic functions. Other embodiments, on the other hand, do not impose this requirement.

As shown in FIG. 22, the process 2200 initially generates (at 2202) numerous directed graphs up to the given maximum number of nodes and edges. The process 2200 in some embodiments generates graphs with at most 8 nodes and at most 16 edges, because the number of graphs grows exponentially with the number of nodes and edges. The pseudo code below illustrates how some embodiments generate such directed graphs.

-   For node_number=1, . . . , max_node_number
    -   For edge_number=1, . . . , max_edge_number
        -   Construct all graphs that have node_number nodes and edge_number edges and that have a top node that is dependent on all input variables,
        -   For each constructed graph, if graph is not isomorphic to any previously saved graphs and if graph is cycle free, then encode graph and save encoded graph in a list
-   Construct the Graph Table 1515 from the list of saved graphs.

As indicated in the pseudo code above, the process 2200 in some embodiments performs 2202 by initially enumerating all combinations of nodes and edges. For each combination of nodes and edges, the process then constructs each graph that has the number of nodes and edges in the combination and that has at least one pivot node. As illustrated in FIG. 23, a pivot node is a node that has its topological fan-in cone receive all the input variables.

Numerous known techniques can be used to construct all graphs for a given number of nodes and edges. Some embodiments initially construct all undirected graphs for a given number of nodes and edges. There are software packages available for constructing all undirected graphs. One such package is the “geng” program package by Brendan D. McKay (who may be contacted by email), Computer Science Department, Australian National University. This package can be downloaded through the internet.

After generating all undirected graphs, these embodiments generate all directed graphs by trying all possible assignments for directions on all edges of each graph. After constructing all directed graphs for each combination of nodes and edges, the process discards all cyclic graphs for the combination, and then stores each remaining graph in the graph table so long as the graph is not isomorphic to a previously stored graph.
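The orientation step can be sketched as follows, assuming the undirected graphs have already been produced by some other tool. The sketch tries every assignment of directions to the edges of one undirected graph and keeps only the cycle-free results. The names is_acyclic and acyclic_orientations are hypothetical, the cycle check uses a standard topological-sort test (one of several known ways to detect cycles), and the isomorphism filtering described in the text is deliberately left out.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple

Edge = Tuple[int, int]


def is_acyclic(num_nodes: int, edges: Sequence[Edge]) -> bool:
    """Topological-sort test: a directed graph is acyclic iff every node can be removed."""
    indegree = [0] * num_nodes
    successors: List[List[int]] = [[] for _ in range(num_nodes)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = [n for n in range(num_nodes) if indegree[n] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == num_nodes


def acyclic_orientations(num_nodes: int, undirected: Sequence[Edge]) -> Iterator[List[Edge]]:
    """Yield every cycle-free assignment of directions to the undirected edges."""
    for flips in product((False, True), repeat=len(undirected)):
        directed = [(v, u) if flip else (u, v) for (u, v), flip in zip(undirected, flips)]
        if is_acyclic(num_nodes, directed):
            yield directed


TRIANGLE = [(0, 1), (1, 2), (0, 2)]
```

For the triangle, two of the eight possible orientations form a directed cycle and are discarded. The number of orientations doubles with each additional edge, which is one reason the text bounds the number of nodes and edges.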

Checking for cycles and identifying isomorphic graphs is commonly known in the art. For instance, Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms, Second Edition, Chapter 22 (Elementary Graph Algorithms), MIT Press 2001 discloses one manner of checking a graph for cycles by traversing the graph. In addition, there are software packages available for identifying isomorphic graphs. One such package is the “nauty” package by Brendan D. McKay (who may be contacted by email), Computer Science Department, Australian National University. This package can be downloaded through the internet.

For each graph that the process 2200 stores in the graph table, the process assigns and stores a graph-table index. The graphs are stored in the graph table in a particular order specified by their graph-table indices (e.g., are sorted by increasing indices).

After 2202, the process 2200 selects (at 2204) one of the graphs from the graph table. It then constructs (at 2206) all combinational-logic sub-networks that can be derived from the selected graph. The process constructs each sub-network for the selected graph by assigning a unique set of functions to the set of nodes for the graph. Any function of the combinational-logic library can be assigned to any node of the graph so long as the node is suitable for the function. A node is suitable if the number of ingoing edges of this node is equal to the number of variables on which the function depends. In some embodiments, each sub-network specified at 2206 is expressed temporarily in terms of its graph and the set of local functions for the nodes of this graph.
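The suitability rule lends itself to a short sketch: for each node, collect the library functions whose number of variables equals the node's number of ingoing edges, and then take every combination across the nodes. The function name assign_functions and the representation of the library as a mapping from a function name to its number of variables are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def assign_functions(in_degrees: Sequence[int], library: Dict[str, int]) -> List[Tuple[str, ...]]:
    """Enumerate every assignment of suitable library functions to the graph nodes.

    in_degrees[i] is the number of ingoing edges of node i; library maps a
    (hypothetical) function name to the number of variables it depends on.
    """
    choices = [
        [name for name, arity in library.items() if arity == degree]
        for degree in in_degrees
    ]
    # If some node has no suitable function, product() yields nothing.
    return [tuple(combo) for combo in product(*choices)]


LIBRARY = {"AND2": 2, "OR2": 2, "INV": 1}
```

With this library, a graph whose three nodes have 2, 2, and 1 ingoing edges yields four candidate sub-networks, while a graph containing a node with three ingoing edges yields none.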

The process next selects (at 2208) a sub-network identified at 2206. The process then computes (at 2210) all output functions that the selected sub-network realizes (i.e., identifies a function for each output of the sub-network). In some embodiments, the process defines the output of each node of the sub-network as an output of the sub-network. Also, the output function at a particular node's output is the particular node's local function when the particular node does not receive the output of any other node in the graph. Alternatively, when the particular node receives the output of another node or other nodes in the graph, an output function at the particular node's output can be derived from the particular node's local function and the local function of each node whose output the particular node receives.
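Deriving the output function of each node from the local functions can be sketched by evaluating the sub-network for every input assignment. Note the substitution: the embodiments described here express functions as ROBDDs, whereas this sketch uses explicit truth tables for simplicity. The representation of a node as a pair of a local function and a list of sources, and the name output_truthtables, are assumptions.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# ("in", k) refers to sub-network input k; ("node", j) refers to the output of node j.
Source = Tuple[str, int]
Node = Tuple[Callable[..., int], Sequence[Source]]


def output_truthtables(num_inputs: int, nodes: Sequence[Node]) -> List[Tuple[int, ...]]:
    """Return one truth table per node; nodes must be listed in topological order."""
    tables: List[List[int]] = [[] for _ in nodes]
    for assignment in product((0, 1), repeat=num_inputs):
        values: List[int] = []
        for func, sources in nodes:
            args = [assignment[i] if kind == "in" else values[i] for kind, i in sources]
            values.append(func(*args))
        for table, value in zip(tables, values):
            table.append(value)
    return [tuple(t) for t in tables]


# Hypothetical two-node sub-network: node 0 = AND(a, b), node 1 = OR(node 0, c).
EXAMPLE_NODES: List[Node] = [
    (lambda x, y: x & y, [("in", 0), ("in", 1)]),
    (lambda x, y: x | y, [("node", 0), ("in", 2)]),
]
```

In this example, the output function of node 0 is simply its local function, while the output function of node 1 is the composition (a AND b) OR c. The rows of each table follow the assignments 000, 001, . . . , 111 of (a, b, c).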

In some embodiments, each sub-network specified at 2210 is expressed just in terms of its set of output functions. This is because at this stage the sub-networks are constructed only to create the hashing function. Also, at this stage, each output function is expressed in terms of its ROBDD. Also, at 2210, the process identifies any output function of a sub-network that can serve as a pivot function. As mentioned above, a sub-network's pivot function is a function that is dependent on all the inputs to the sub-network.

The process then (at 2212) applies a set of filtering rules to the selected sub-network and discards (i.e., filters) the selected sub-network if the selected sub-network satisfies any one of these rules. In different embodiments, the process uses different sets of rules to filter the sub-networks. For instance, in some embodiments, the process 2200 discards a sub-network when (1) the sub-network has duplicate output functions, (2) it has an output function that is identical to one of the sub-network's inputs, or (3) it does not have a pivot output function. Even though each generated graph has a pivot node, the resulting sub-networks might not have pivot functions because some of the input variables might drop out as a result of the particular functions implemented by the sub-networks.
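The three example rules can be sketched over truth tables as follows. As before, truth tables stand in for the ROBDDs used by the embodiments, input 0 is assumed to be the most significant bit of the row number, and the names input_table, depends_on, is_pivot, and keep_subnetwork are hypothetical.

```python
from typing import Sequence, Tuple

Table = Tuple[int, ...]


def input_table(num_inputs: int, k: int) -> Table:
    """Truth table of the function that simply returns input k."""
    shift = num_inputs - 1 - k
    return tuple((row >> shift) & 1 for row in range(2 ** num_inputs))


def depends_on(table: Table, num_inputs: int, k: int) -> bool:
    """True if flipping input k changes the function value for some assignment."""
    mask = 1 << (num_inputs - 1 - k)
    return any(
        table[row] != table[row | mask]
        for row in range(2 ** num_inputs)
        if not row & mask
    )


def is_pivot(table: Table, num_inputs: int) -> bool:
    """A pivot function depends on all the inputs to the sub-network."""
    return all(depends_on(table, num_inputs, k) for k in range(num_inputs))


def keep_subnetwork(tables: Sequence[Table], num_inputs: int) -> bool:
    """Apply the three example filtering rules; False means discard."""
    if len(set(tables)) != len(tables):  # (1) duplicate output functions
        return False
    inputs = {input_table(num_inputs, k) for k in range(num_inputs)}
    if any(t in inputs for t in tables):  # (2) an output identical to an input
        return False
    return any(is_pivot(t, num_inputs) for t in tables)  # (3) needs a pivot function


AND_AB = (0, 0, 0, 0, 0, 0, 1, 1)       # a AND b, independent of c
AND_AB_OR_C = (0, 1, 0, 1, 0, 1, 1, 1)  # (a AND b) OR c, depends on a, b, and c
```

A sub-network whose only output is a AND b over the inputs a, b, and c would be discarded under rule (3), because the input c has dropped out of every output function.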

The process also discards a sub-network when the sub-network has at least one node with an output that is not dependent on all the inputs that are fed into the node's topological fan-in cone. For instance, FIG. 24 illustrates a three-node sub-network that includes a first node 2405 that receives the outputs of second and third nodes 2410 and 2415, where the second node receives first and second input signals, and the third node receives third and fourth input signals. In such a sub-network, the output of each node must depend on its inputs, and the output of the first node must depend on the first through fourth inputs, according to the above-described filtering rule.

The process next determines (at 2214) whether it has examined all the sub-networks generated at 2206. If not, it transitions back to 2208 to select an unexamined sub-network. Otherwise, it determines (at 2216) whether it has examined all the graphs generated at 2202. If it has not examined all the graphs, it transitions to 2204 to select an unexamined graph.

If the process has examined all the graphs, it transitions from 2216 to 2218. By the time the process reaches 2218, it has identified and retained numerous sub-networks. Each retained sub-network is specified by a set of output functions that includes one or more functions that can serve as pivot functions. Each sub-network's function set might also include one or more functions that cannot serve as pivot functions.

Next, the process uses (at 2218-2226) the generated sets of output functions of the sub-networks to generate a hashing function. Specifically, at 2218, the process selects one set of output functions (which represents one sub-network) that was not discarded at 2212 (i.e., that remains after 2216).

From this selected set, the process defines (at 2220) one or more sets of functions, with each defined set having one pivot function and potentially one or more non-pivot functions (a set does not have a non-pivot function if it only has one function). At 2220, the process defines as many sets of functions as there are potential pivot functions in the set selected at 2218. For instance, the set selected at 2218 might have five functions (F1, F2, F3, F4, and F5), of which only the second and third (F2 and F3) ones can serve as pivot functions. From such a set, the process defines (at 2220) two sets of functions, one with the second function as the pivot function and the rest as non-pivot functions, and the other with the third function as the pivot function and the rest as non-pivot functions.

For each of the function sets identified at 2220, the process 2200 directs the indexer to generate a truthtable representation of each function in the set based on an input variable ordering that is identified from the pivot function. As described above, some embodiments select the input ordering that leads to the canonical truthtable representation of the pivot function. In some embodiments, the canonical truthtable is the smallest-valued truthtable.

When several input variable orderings lead to the pivot function's canonical truthtable representation during database generation, the indexer's process 800 deterministically selects the input ordering. Specifically, as mentioned above, the indexer's input order identifier always returns (at 810) the set of input variable configurations in a particular order. When the indexer operates deterministically during pre-tabulation, the index manager always selects (at 815) the same input configuration in the returned set of input configurations as the designated input configuration. For instance, in some embodiments, the index manager always selects the first input configuration in the returned set as the designated configuration.
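A brute-force version of this selection can be sketched as follows. The sketch tries every input ordering in a fixed (lexicographic) sequence, keeps the ordering that gives the smallest truthtable, and breaks ties by keeping the first ordering found, which makes the choice deterministic. The names permuted_table and canonical_ordering are hypothetical, exhaustive enumeration is an assumption (the indexer may identify orderings differently), and treating row 0 as the most significant position when comparing truthtables is likewise assumed.

```python
from itertools import permutations
from typing import Optional, Tuple

Table = Tuple[int, ...]


def permuted_table(table: Table, num_inputs: int, order: Tuple[int, ...]) -> Table:
    """Truth table after reordering inputs; order[k] is the original input placed at position k."""
    out = []
    for row in range(2 ** num_inputs):
        original_row = 0
        for k in range(num_inputs):
            bit = (row >> (num_inputs - 1 - k)) & 1
            original_row |= bit << (num_inputs - 1 - order[k])
        out.append(table[original_row])
    return tuple(out)


def canonical_ordering(table: Table, num_inputs: int) -> Tuple[Table, Tuple[int, ...]]:
    """Return the smallest truthtable and the first input ordering that produces it."""
    best: Optional[Tuple[Table, Tuple[int, ...]]] = None
    # permutations() enumerates orderings in a fixed order, so ties are always
    # resolved in favor of the same (first) ordering.
    for order in permutations(range(num_inputs)):
        candidate = permuted_table(table, num_inputs, order)
        if best is None or candidate < best[0]:
            best = (candidate, order)
    assert best is not None
    return best
```

The ordering chosen for the pivot function would then be applied, through permuted_table, to every non-pivot function of the same set, so that all truthtables of the set are expressed over the same input ordering.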

The process then determines (at 2222) whether it has examined all sub-networks that were retained at 2212 (i.e., whether it has examined all sets of functions that remained after 2216). If not, the process returns to 2218 to select another set of output functions (which represents another sub-network) that was not discarded at 2212.

Otherwise, the process generates (at 2224) a hash table by associating the truthtable of each particular function in each set defined at 2220 with a particular index value. The process then computes a hashing function (at 2226) based on the generated hash table. One such manner of generating a hash table and function is described in Czech's paper, which was cited above.

After 2226, the process essentially executes all of the operations 2204-2216 once again. Specifically, at 2228, the process selects one of the graphs from the graph table generated at 2202. It then constructs (at 2230) all combinational-logic sub-networks that can be derived from the selected graph. The process constructs each sub-network in the same manner as described above for 2206.

The process next selects (at 2232) a sub-network identified at 2230. The process then (at 2234) computes all output functions that the selected sub-network realizes (i.e., identifies a function for each output of the sub-network) and identifies each of the sub-network's output functions that can serve as a pivot function. The operation of the process at 2234 is similar to its operation at 2210, except that at 2234 the process specifies each sub-network not only in terms of its identified output functions but also in terms of its graph. At this stage, each output function is expressed in terms of its ROBDD.

The process then applies (at 2236) a set of filtering rules to the selected sub-network and discards (i.e., filters) the selected sub-network if the selected sub-network satisfies any one of these rules. The process applies the same filtering rules at 2236 that it applied at 2212. The process next determines (at 2238) whether it has examined all the sub-networks generated at 2230. If not, it transitions back to 2232 to select an unexamined sub-network. Otherwise, for each sub-network that the process retained at 2236, it defines (at 2240) one or more sets of output functions, with each set having one pivot function and potentially one or more non-pivot functions. As in 2220, the process at 2240 defines as many sets of functions for each particular sub-network as there are potential pivot functions in the function set specified at 2234 for the particular sub-network.

For each set of functions specified at 2240 for a particular sub-network, the process also generates (at 2240) a set of indices for the particular sub-network. Each set of indices includes one pivot index and potentially one or more secondary indices; a set does not include any secondary indices if it was generated for a function set with only one function.

At 2240, the process directs the indexer to generate a truthtable representation of each function in the set based on an input variable ordering that is deterministically selected from the canonical truthtable representation of the pivot function. The deterministic selection of the input variable ordering at 2240 is identical to the deterministic selection of the input ordering at 2220. Specifically, the indexer's input order identifier always returns (at 810) the set of input variable configurations in a particular order, and the index manager during pre-tabulation always selects (at 815) the same input configuration in the returned set of input configurations as the designated input configuration. After generating the truthtable representation of each function, the process generates (at 2240) the indices by using the hashing function generated at 2226.

At 2240, the process also assigns a network table index to each sub-network that was retained at 2236. For each retained sub-network, the process stores (at 2240) in a list the sub-network's graph table index (which specifies its graph in the graph table), one or more function table indices (each specifying a local function in the function table), one or more sets of function indices (which were defined at 2240), and a generated network table index.

After 2240, the process determines (at 2242) whether it has examined all the graphs generated at 2202. If it has not examined all the graphs, it transitions to 2228 to select an unexamined graph. Otherwise, it discards (at 2244) multiple definitions of the same network or nearly the same networks. This is done by deleting all but one out of each group of generated sub-networks that have (1) the same graph table index, and (2) the same sets of function indices (defined at 2240). Such duplicate networks may appear for example because of symmetries of the graph structure. Based on the list of network table indices and function indices, the process then completes (at 2246) the database tables 525. Specifically, the process first creates the network table 1510, then the secondary index table 1505, and then the pivot-index table 1500. As mentioned above, the network table is sorted in an order specified by its stored network indices, the secondary table is sorted in an order specified by its stored primary and secondary indices, and the primary table 1500 is sorted in an order specified by its stored primary indices. After 2246, the process ends.

The process 2200 generates a large number of combinational-logic sub-networks up to a specified size. This is radically different from previous approaches that work with network transformations from a small set of possibilities that are derived from expert knowledge. This new approach enables optimization processes to identify numerous sub-network alternatives through a simple and direct lookup into a database. In contrast, hand-coded transformations (whether directly implemented as program code or used as parametrizable input-rules to a logic synthesis system) can exploit only a small subset of possible implementations. Moreover, the above-described, automated database approach makes it possible to integrate expert knowledge in addition to the machine-generated networks. In other words, this approach provides the ability to add sub-networks to the database with arbitrary circuit elements.

V. Technology Mapping

The data storage-driven synthesis described above can also be used to perform technology mapping. Some current logic synthesis systems perform the following three operations. First, they perform logic optimization on unbounded networks that contain circuit elements that perform arbitrary local functions. Second, they map the optimized network into a sub-optimal network that is made of simple local functions (like 2-way-NANDs, etc.). Third, they map the sub-optimal network into a bounded network that is bounded to a particular technology library. These systems follow this approach because the powerful logic-optimization techniques work best on unbounded circuit networks. Technology mapping operations do not always map the sub-network into a network that is made of simple local functions (like 2-way-NANDs, etc.). This operation simply makes the mapping to the technology library simpler.

There are other similar technology mapping systems. For instance, some systems perform the following three operations. First, they map an unbound network, which contains circuit elements that perform arbitrary local functions, into another network that is made of simple local functions (like 2-way-NANDs, etc.). Second, they perform logic optimization on the mapped network. Third, these systems map the optimized network into a network that is bound to a particular technology library.

Under current technology mapping techniques, additional mapping operations have to be performed to map into the target library in order to ensure the manufacturability of the final network. These mapping operations need to take into account physical implementation details (such as timing and size of the gates). These mapping operations are usually based on different algorithms from those used for the optimization operation. Therefore, there is the risk of getting sub-optimal results from the algorithmic differences between the optimization and technology mapping operations.

One type of technology mapping operation is tree mapping. Typical tree mapping operations use recursive dynamic programming techniques that (1) select small sub-networks, (2) in each selected sub-network, identify micro-trees or micro-leaf DAGs with one or more elements, and then (3) map each micro-tree or micro-leaf DAG to one gate in the technology library. Other tree mapping operations identify micro-trees directly from the network, and then map each micro-tree to one gate in the technology library.

A tree includes N nodes and N−1 edges. A micro-tree is a tree with a few nodes. A DAG is a directed acyclic graph. A DAG can have any number of edges. A micro-leaf DAG is similar to a micro-tree except that its leaf nodes can have outputs that connect to more than one other node.
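The distinction among these graph types can be sketched with a small classifier. The sketch assumes that the input is already a connected, acyclic graph given as (driver, sink) edge pairs with signals flowing from the leaves toward a single root; the function name `classify` and the returned labels are illustrative only.

```python
def classify(nodes, edges, root):
    """Classify a connected acyclic graph as a tree, a leaf-DAG, or a
    general DAG, based on the fan-out counts of its nodes."""
    fanout = {n: 0 for n in nodes}
    fanin = {n: 0 for n in nodes}
    for driver, sink in edges:
        fanout[driver] += 1
        fanin[sink] += 1
    leaves = {n for n in nodes if fanin[n] == 0}
    inner = [n for n in nodes if n != root and n not in leaves]
    if fanout[root] != 0:
        return "general DAG"
    # Tree: every non-root node drives exactly one node, so there are N-1 edges.
    if all(fanout[n] == 1 for n in nodes if n != root):
        return "tree"
    # Leaf-DAG: only the leaf nodes may drive more than one node.
    if all(fanout[n] == 1 for n in inner):
        return "leaf-DAG"
    return "general DAG"


tree = classify(["a", "b", "r"], [("a", "r"), ("b", "r")], "r")
leaf_dag = classify(
    ["a", "b", "g1", "g2", "r"],
    [("a", "g1"), ("b", "g1"), ("a", "g2"), ("b", "g2"), ("g1", "r"), ("g2", "r")],
    "r",
)
general = classify(
    ["a", "b", "g1", "g2", "r"],
    [("a", "g1"), ("b", "g1"), ("g1", "g2"), ("b", "g2"), ("g1", "r"), ("g2", "r")],
    "r",
)
```

In the third example the internal node g1 drives two other nodes, so the graph is neither a tree nor a leaf-DAG.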

The tree-mapping operations select micro-tree or micro-leaf DAG sub-networks that have only one output, which is the output at the root node (top-level node) of the micro-tree or micro-leaf DAG. These operations do not select any graph structure that has lower-level nodes (i.e., its non-root nodes) that have outputs outside of the tree (i.e., have fan-outs to nodes outside of the tree). In other words, each graph selected by a typical tree-mapping operation has a set of fan-ins (i.e., one or more inputs) and only one fan-out (i.e., only one output).

Another previous technology mapping operation is structural mapping. Structural mappers typically use inverters and a single other type of gate (such as a two-input NAND or NOR) to remap the optimized network to a sub-optimal network. Under this approach, the local function of a node is entirely determined by the number of inputs of the node (e.g., a one-input-node is an inverter, while all other nodes are the chosen base function, such as a NAND). Therefore, the combinational-logic functions that are realized by any sub-network are defined solely by its graph structure.
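The observation that the graph structure alone fixes the realized function can be sketched as follows. The sketch assumes NAND as the base function and represents a graph as a hypothetical dictionary from each node to the list of its fan-ins; the function name `truthtable_of` is illustrative.

```python
from itertools import product


def truthtable_of(graph, inputs, output):
    """Compute the truthtable of `output` when a one-input node is an
    inverter and every other node is a NAND (the assumed base function)."""
    def value(node, env):
        if node in env:                      # primary input
            return env[node]
        fanins = [value(f, env) for f in graph[node]]
        if len(fanins) == 1:                 # one-input node: inverter
            return 1 - fanins[0]
        return 1 - int(all(fanins))          # multi-input node: NAND

    return tuple(value(output, dict(zip(inputs, bits)))
                 for bits in product((0, 1), repeat=len(inputs)))


# A NAND followed by an inverter realizes a two-input AND.
and_table = truthtable_of({"n1": ["a", "b"], "n2": ["n1"]}, ["a", "b"], "n2")
nand_table = truthtable_of({"n1": ["a", "b"]}, ["a", "b"], "n1")
```

No local function is stored with any node here; the number of fan-ins is the only information that the evaluation uses.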

Structural mappers also typically remap each element in the technology library into a graph structure by using inverters and the same type of gate used to remap the optimized network. Once each member of the technology library is represented as some graph, the structural mapping process then partitions the sub-optimal network into sub-parts. For each sub-part, it then tries to identify a graph structure (1) that represents a single element in the target library that corresponds to the sub-part's graph structure, and (2) that is as good as possible according to specific cost functions (area, timing, power). The partitioning process is often performed using string matching techniques similar to the ones that are used within compilers that translate high-level programming languages to low-level machine languages.

Though structural mapping defines a general framework to solve the library mapping task, it has several practical disadvantages. Efficient algorithms are known only for tree structures. More general graph structures have to be decomposed heuristically into simple tree structures. This artificial decomposition degrades the quality of structural mapping.

The invention's data storage-driven optimization does not suffer from any of these deficiencies. The candidate sub-networks that it maps to a particular technology do not need to have tree structures, but rather can have more general directed acyclic graph (“DAG”) structures. There is no restriction on the number of edges that these structures can have. Also, each candidate sub-network can have multiple output nodes. Accordingly, the candidate sub-networks can be larger, as their internal nodes can have fan-outs to nodes outside of the sub-networks. In addition, the data storage-driven optimization can explore a large number of pre-tabulated sub-networks based on their functions. It does not require inefficient mapping to simple gates. Also, it can retrieve a multi-element replacement sub-network in a single operation based on this sub-network's set of output functions. Moreover, for a particular set of local functions, it can pre-tabulate all sub-networks up to a specified level of complexity.

FIG. 25 illustrates a process 2500 for performing technology mapping using the invention's data storage-driven optimization. This process starts each time it receives (1) a circuit network that is not designed for a specific technology library, and (2) a database (or other storage structure) that contains pre-tabulated sub-networks that are bound to the specific technology library. The database can be pre-tabulated based on the approach described above in Section IV. Once the process 2500 receives the circuit network and the database, it optimizes the circuit network for the specific technology library.

As shown in FIG. 25, the process initially uses (at 2505) the received database to perform process 200 for the received circuit network. The process 200 continuously changes the received network so that more and more parts of it are bound to the target library (as each replacement sub-network is bound to the target library). Some embodiments might use the process 200 slightly differently when they use it as part of the process 2500 for technology mapping. So long as the query manager returns, for a selected candidate sub-network, at least one replacement sub-network from the data storage that stores the technology-bound sub-networks, these embodiments will always find (at 230) one of the replacement sub-networks acceptable, and thereby will always exchange (at 235) the selected candidate sub-network with one of the returned replacement sub-networks. Other embodiments, however, might still evaluate (at 230) whether to replace a selected candidate sub-network with one of the retrieved replacement sub-networks, and might not replace candidate sub-networks with replacement sub-networks in some instances. Also, when the query manager returns more than one replacement sub-network for a candidate sub-network, the process 200, as described above, selects one of the replacement sub-networks randomly or selects a replacement sub-network that has the best cost (which was computed at 225).

Once the process 200 reaches its stop criteria (e.g., performs a maximum number of iterations or reaches the stopping criteria of the optimization algorithm, such as the annealer), the process 2500 transitions to 2510. At 2510, the process 2500 traverses the circuit network that exists after 2505 in order to identify any node (i.e., any circuit element) in this network that is potentially not bound to the target library. Some embodiments regard a node as potentially unbound if it was not added to the circuit network within any exchange step performed in 2505. For any node N that is potentially not bound to a gate in the target library, the process 2500 treats the node N as a one-node candidate sub-network and accordingly finds all matching sub-networks in the database that realize the function of the node N. The process 2500 then replaces the node N with the most suitable replacement for this node. The process 2500 continues its search through the circuit network until it ensures that the network no longer contains any circuit element that is not bound to the technology library.
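The traversal at 2510 can be sketched under simplifying assumptions. In the sketch, the network is a hypothetical dictionary from node names to the truthtables of their local functions, the set `bound` holds the nodes that were added by exchange steps at 2505, and the database is a plain dictionary from truthtables to (gate, cost) pairs. The gate names and costs are invented for illustration, and the indexed lookup of the described embodiments is replaced by a dictionary lookup.

```python
def bind_remaining_nodes(network, bound, database):
    """Pick the lowest-cost library gate for every potentially unbound node."""
    chosen = {}
    for node, truthtable in network.items():
        if node in bound:
            continue                      # added by an exchange step already
        candidates = database.get(truthtable, [])
        if not candidates:
            # A node function that is too complex needs another method first,
            # such as a decomposition into smaller functions.
            raise LookupError(f"no library match for node {node!r}")
        gate, _cost = min(candidates, key=lambda pair: pair[1])
        chosen[node] = gate
    return chosen


network = {"n1": (0, 0, 0, 1), "n2": (1, 0)}
database = {(1, 0): [("INV_X2", 3.5), ("INV_X1", 2.0)]}
chosen = bind_remaining_nodes(network, bound={"n1"}, database=database)
```

Only node n2 is treated as a one-node candidate here, and the cheaper of its two matching gates is selected.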

It should be noted that finding some replacement in the database for an unbound node is always possible as long as the local function of the node is not too complex. In the majority of cases, this criterion is easily satisfied. However, in cases where a network has an arbitrarily complex node function, the process 2500 needs to use other methods (e.g., Boolean decomposition by Shannon expansion as described above) first to simplify the unmanageable nodes and then to map the simplification to the target library. In other words, the process 2500 needs to decompose the complex node function into a set of smaller functions, and then identify a set of replacement sub-networks that perform the set of smaller functions.
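As an illustration of decomposing a complex node function into smaller functions, the following sketch applies Shannon expansion, f = x'·f(x=0) + x·f(x=1), to a truthtable. It assumes that truthtables are tuples indexed with the first input as the most significant bit, always splits on the first input, and stops once each piece has at most `max_inputs` inputs; these choices and all names are assumptions of the sketch rather than the method of the described embodiments.

```python
from itertools import product


def shannon_decompose(tt, max_inputs):
    """Split a truthtable on its first input until every leaf function has
    at most max_inputs inputs. Returns a leaf tuple or ("mux", f0, f1)."""
    n = len(tt).bit_length() - 1          # number of inputs
    if n <= max_inputs:
        return tt
    half = len(tt) // 2                   # first half: x = 0, second: x = 1
    return ("mux",
            shannon_decompose(tt[:half], max_inputs),
            shannon_decompose(tt[half:], max_inputs))


def evaluate(tree, bits):
    """Evaluate a decomposition for one input assignment."""
    if tree[0] == "mux":
        branch = tree[2] if bits[0] else tree[1]
        return evaluate(branch, bits[1:])
    k = 0
    for b in bits:
        k = (k << 1) | b
    return tree[k]


majority = (0, 0, 0, 1, 0, 1, 1, 1)       # 3-input majority function
tree = shannon_decompose(majority, max_inputs=2)
```

For the majority function, the two cofactors are the two-input AND and the two-input OR of the remaining inputs, each of which is small enough to be looked up separately.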

After 2510, the process 2500 uses (at 2515) the received database to perform process 200 again on the circuit network that remains after 2510. The process 200 then optimizes this network again for the target library. This optimization can be viewed as a clean-up operation that rectifies any sub-optimal networks that resulted from the forced exchanges at 2510.

The technology mapping operation of process 2500 can be advantageous in many contexts. It provides superior mapping of networks that are not bound to any target library to a particular target library. Also, it provides superior mapping of networks from one bound target library to another. This can be highly advantageous when mapping from one technology (e.g., 0.13 micron technology) to another technology (e.g., 0.1 micron technology).

VI. NPN-Equivalence

As discussed above by reference to FIGS. 10-13, some embodiments treat as equivalent two sub-networks that can be made identical by permuting the set of inputs of one of the sub-networks. By using this first equivalence relationship, these embodiments can reduce the number of equivalent sub-networks that are stored in the data storage.

Some of these embodiments establish two other equivalence relationships to reduce the number of stored equivalent sub-networks. First, these embodiments treat as equivalent two sub-networks that can be made identical by inverting any subset of the inputs (i.e., one, several, or all of the inputs) of one of the sub-networks. Second, they treat as equivalent two sub-networks that can be made identical by inverting any subset of the outputs (i.e., one, several, or all of the outputs) of one of the sub-networks. Accordingly, in these embodiments, two sub-networks are treated as equivalent whenever they can be made identical by any combination of one or more of the following transformations: (1) inverting a particular subset of the inputs, (2) permuting the set of inputs, and (3) inverting a particular subset of the outputs. Two sub-networks are identical when they perform the same output functions and have the same graph.

The term NPN equivalence refers to all three equivalence relationships described above. In this term, the first “N” refers to the inversion of a subset of the inputs, the “P” refers to the permuting of the input set, and the second “N” refers to the inversion of a subset of the output set.

Both N equivalences are based on the assumption that the inversion of signals is cost-free (i.e., any signal in a design may be inverted by an additional inverter with no costs in area, timing, etc.). There are several reasons for this assumption. First, during early optimization, it may be suitable to drop the inversion issue and just focus on the general restructuring of the circuit description, since at this stage the final properties of the circuit description can only roughly be inferred. Second, some technologies by default provide output signals in both polarities (i.e., for each output pin “A,” they have a pin “A_BAR” that provides the complemented signal). Third, by having no cost on inversions, the storage of pre-tabulated sub-networks can be made much more powerful, since more complex sub-networks can now be stored within the same amount of memory. This results in more powerful optimization as it enables exchange of more complex sub-networks.

The embodiments described above account for the P equivalence (i.e., account for the permuting of the input set). To account for the two N equivalences (i.e., to account for the equivalence due to the inversion of the input and/or output), the following operations have to be modified.

Process 800 at 810 and 815.

To account for NPN equivalence, the canonicalization operation 810 of process 800 of FIG. 8 needs to be modified. For the truthtable representation of a combinational-logic function, the canonicalization can be made NPN-aware based on the technique disclosed in “Boolean Matching for Large Libraries” by Uwe Hinsberger and Reiner Kolla, DAC98, Jun. 15-19, 1998. This technique uses a branch and bound algorithm similar to the one used in the P-case. N-aware canonicalization for other combinational-logic function representations is disclosed in other references, such as “Efficient Boolean Function Matching”, Jerry R. Burch and David E. Long, Proc. ICCAD 1992.

NPN-canonicalization of combinational-logic functions F_1 and F_2 results in the same representation whenever F_1 and F_2 can be made equal by a combination of (1) switching one or more input phases, (2) permuting inputs, and (3) switching one or more output phases. For the truthtable representation of a function F, the process of NPN-canonicalization for the function F identifies (at 810) one or more transformation sets. Each identified transformation set consists of some input switching, input permutation, and output switching (in this order) and leads to the canonical truthtable representation of the pivot function. Specifically, each identified set of NPN transformations T specifies (1) an input variable configuration P, (2) a subset Z of the input variables to be switched, and (3) a Boolean variable O that indicates whether the whole function should be complemented or not.

At 810, the process 800 identifies the NPN-canonical representation of the function F. This canonical representation is the truthtable that the function F produces after accounting for the transformations specified by one of the sets identified at 810 (i.e., after accounting for the set's input variable inversion Z, then accounting for the set's input variable configuration P, and then accounting for the set's output variable inversion O).
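A minimal sketch of NPN-canonicalization for a single truthtable follows. It is not the branch and bound technique cited above: it simply enumerates every combination of Z, P, and O, which is practical only for a few inputs. The sketch assumes that a truthtable is a tuple of bits indexed with the first input as the most significant bit, that the canonical representation is the smallest such tuple, and that Z is encoded as a tuple of 0/1 flags; all function names are illustrative.

```python
from itertools import permutations, product


def apply_npn(tt, n, Z, P, O):
    """Truthtable of g(y) = O xor f(x), with x[i] = y[P[i]] xor Z[i]
    (input inversion Z, then input configuration P, then output inversion O)."""
    out = []
    for y in product((0, 1), repeat=n):
        k = 0
        for i in range(n):
            k = (k << 1) | (y[P[i]] ^ Z[i])
        out.append(tt[k] ^ O)
    return tuple(out)


def npn_canonicalize(tt, n):
    """Return the canonical truthtable and every set (Z, P, O) leading to it."""
    best, sets = None, []
    for Z in product((0, 1), repeat=n):
        for P in permutations(range(n)):
            for O in (0, 1):
                candidate = apply_npn(tt, n, Z, P, O)
                if best is None or candidate < best:
                    best, sets = candidate, [(Z, P, O)]
                elif candidate == best:
                    sets.append((Z, P, O))
    return best, sets


AND, OR, XOR = (0, 0, 0, 1), (0, 1, 1, 1), (0, 1, 1, 0)
canon_and, sets_and = npn_canonicalize(AND, 2)
canon_or, _ = npn_canonicalize(OR, 2)
canon_xor, _ = npn_canonicalize(XOR, 2)
```

The two-input AND and OR functions reach the same canonical truthtable, since OR is AND with both inputs and the output inverted, whereas XOR falls into a different equivalence class. More than one transformation set can lead to the canonical truthtable, which is why a selection among the returned sets is needed.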

Instead of selecting only an input variable configuration at 815, the process 800 when running in an NPN-mode selects (at 815) an NPN transformation set. This transformation set is one of the ones identified at 810. The process 800 randomly selects this set to generate indices of a query during optimization. On the other hand, it deterministically selects this NPN transformation during pre-tabulation.

Process 900 at 910 and 915.

In NPN-mode, the process 900 of FIG. 9 receives (at 910) the NPN-transformation set selected at 815. The process 900 of FIG. 9 applies the N and the P transformations in the transformation set selected at 815 first to non-pivot functions of a multi-function query. Specifically, it applies the set's input variable inversion Z to a non-pivot function. It then applies the set's input variable configuration P after applying the set's input variable inversion Z. Next, it takes the resulting truthtable (“non-inverted truthtable”) for the non-pivot function and inverts it to obtain an inverted truthtable. The process then examines the inverted and non-inverted truthtables as binary-number strings and selects the smaller one of the non-inverted truthtable and the inverted truthtable. The process then computes the hashed value based on the selected truthtable.
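The handling of a non-pivot function can be sketched as follows, under the same assumed conventions (truthtables as bit tuples with the first input as the most significant bit, Z as a tuple of 0/1 flags, P as a tuple of input positions). The hashing step is omitted, and the function names are illustrative.

```python
from itertools import product


def apply_inputs(tt, Z, P):
    """Apply the input inversion Z and then the input configuration P."""
    n = len(P)
    out = []
    for y in product((0, 1), repeat=n):
        k = 0
        for i in range(n):
            k = (k << 1) | (y[P[i]] ^ Z[i])
        out.append(tt[k])
    return tuple(out)


def non_pivot_truthtable(tt, Z, P):
    """Select the smaller of the non-inverted and inverted truthtables."""
    non_inverted = apply_inputs(tt, Z, P)
    inverted = tuple(1 - b for b in non_inverted)
    # Equal-length bit tuples compare like binary-number strings.
    return min(non_inverted, inverted)


OR, NOR, AND = (0, 1, 1, 1), (1, 0, 0, 0), (0, 0, 0, 1)
key_or = non_pivot_truthtable(OR, Z=(0, 0), P=(0, 1))
key_nor = non_pivot_truthtable(NOR, Z=(0, 0), P=(0, 1))
key_and = non_pivot_truthtable(AND, Z=(1, 0), P=(0, 1))
```

Because the smaller of the two truthtables is always chosen, a non-pivot function and its complement produce the same selected truthtable, and hence would produce the same hashed value.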

Process 1400 at 1415-35.

In NPN mode, the process 1400 determines an NPN transformation set T(Z,P,O) in a deterministic manner by selecting the first transformation specified in the list of returned transformations. At 1420, the process applies the inverse of the transformation X=(W, V, Y) that is used at 605 (i.e., it will use X⁻¹(T)). At 1425, the process computes the functions based on the reordering and input phase switching identified in 1420.

At 1430, the process checks whether the computed set includes the output function set of the candidate sub-network by additionally assuming two functions to be equal if one is the complement of the other. This is because the inversion of each individual output function of the replacement sub-network may be done in addition at no extra cost. If the process does specify a match, then at 1435 all the necessary inverters (i.e., additional nodes with a single input that simply complement their input function) that realize the underlying input and output phase switches are added to the replacement sub-network.
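The relaxed comparison at 1430 can be sketched as a check that treats a function and its complement as equal and records where an output inverter would be needed. The truthtable representation and the function name `match_outputs` are assumptions of the sketch.

```python
def match_outputs(candidate_fns, replacement_fns):
    """For each candidate output function, find a replacement output that
    equals it or its complement. Returns a list of (replacement index,
    needs_inverter) pairs, or None when some function has no match."""
    result = []
    for fn in candidate_fns:
        complement = tuple(1 - b for b in fn)
        for j, rep in enumerate(replacement_fns):
            if rep == fn:
                result.append((j, False))
                break
            if rep == complement:
                result.append((j, True))   # an output inverter must be added
                break
        else:
            return None
    return result


candidate = [(0, 0, 0, 1), (0, 1, 1, 0)]            # AND and XOR
replacement = [(1, 0, 0, 1), (0, 0, 0, 1)]          # XNOR and AND
matched = match_outputs(candidate, replacement)
unmatched = match_outputs([(0, 1, 0, 1)], replacement)
```

In this example the XOR output of the candidate is served by the XNOR output of the replacement together with one added inverter.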

Process 2200 at 2220 and 2240

At 2220 or 2240, the process 2200 deterministically identifies the NPN transformation (Z,P,O) based on the pivot function. At 2220 or 2240, the process 2200 then generates a truthtable for the pivot function according to the identified transformation. It also generates a truthtable for each non-pivot function of the set at 2220 or 2240. To generate the truthtable for a non-pivot function, it applies the set's input variable inversion Z to the non-pivot function. It then applies the set's input variable configuration P after applying the set's input variable inversion Z. Next, it takes the resulting truthtable (“non-inverted truthtable”) for the non-pivot function and inverts it to obtain an inverted truthtable. The process then examines the inverted and non-inverted truthtables as binary-number strings, and selects the smaller one of the non-inverted truthtable and the inverted truthtable.

As mentioned further below, some embodiments do not use NPN-equivalence with technology mapping. Accordingly, the process 2500 of FIG. 25 is not modified.

Using the NPN-equivalence relationships establishes many more equivalences between the pre-computed sub-networks than just P-equivalence. Accordingly, the use of NPN-equivalence results in a large number of sub-networks and index sets being removed at 2244. This reduces the number of sub-networks in the data storage significantly and facilitates the creation of data storage with more complex sub-networks.

The size of the data storage can be reduced by not considering sub-networks that contain an explicit inverter. If the inverter drives inputs of other nodes of the sub-network, then it can be assumed that the node itself complements the input signal (at no cost). If the inverter drives a sub-network output O, then this is considered logically equivalent with a sub-network that generates the same functions except that the function at O is complemented. The embodiments that do not consider buffers (i.e., single-input/single-output nodes that just present the input signal at the output) for either P-only equivalence or NPN-equivalence do not consider graphs with any node that has only one input during pre-tabulation. This reduces the number of sub-networks in the data storage significantly and facilitates the creation of data storage with more complex sub-networks.

Accounting for NPN-equivalence is usually an optimization that is done for unbound combinational-logic networks. Optimizing bound combinational-logic networks or mapping unbound combinational-logic networks to bound ones is usually accomplished by being fully aware of physical characteristics of the gates of the target library (in costing). Inversion is usually not cost free with respect to such detailed physical characteristics. Accordingly, neither N-equivalence is used during technology mapping. Only P-equivalence is considered during technology mapping.

One of ordinary skill will realize that other embodiments might just account for NP-equivalence or PN-equivalence (i.e., might consider just input or just output inversion to be cost free). In these circumstances, the canonicalization is just NP or PN canonicalization. The PN-case may be interesting even for bound combinational-logic networks as some technology libraries indeed always provide output pins for both polarities of the signal.

VII. The Computer System

FIG. 26 presents a computer system with which one embodiment of the present invention is implemented. Computer system 2600 includes a bus 2605, a processor 2610, a system memory 2615, a read-only memory 2620, a permanent storage device 2625, input devices 2630, and output devices 2635.

The bus 2605 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the computer system 2600. For instance, the bus 2605 communicatively connects the processor 2610 with the read-only memory 2620, the system memory 2615, and the permanent storage device 2625.

From these various memory units, the processor 2610 retrieves instructions to execute and data to process in order to execute the processes of the invention. The read-only-memory (ROM) 2620 stores static data and instructions that are needed by the processor 2610 and other modules of the computer system. The permanent storage device 2625, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the computer system 2600 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 2625. Other embodiments use a removable storage device (such as a floppy disk or zip® disk, and its corresponding disk drive) as the permanent storage device.

Like the permanent storage device 2625, the system memory 2615 is a read-and-write memory device. However, unlike storage device 2625, the system memory is a volatile read-and-write memory, such as a random access memory. The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 2615, the permanent storage device 2625, and/or the read-only memory 2620.

The bus 2605 also connects to the input and output devices 2630 and 2635. The input devices enable the user to communicate information and select commands to the computer system. The input devices 2630 include alphanumeric keyboards and cursor-controllers. The output devices 2635 display images generated by the computer system. For instance, these devices display IC design layouts. The output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD).

Finally, as shown in FIG. 26, bus 2605 also couples computer 2600 to a network 2665 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet) or a network of networks (such as the Internet). Any or all of the components of computer system 2600 may be used in conjunction with the invention. However, one of ordinary skill in the art would appreciate that any other system configuration may also be used in conjunction with the present invention.

While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For instance, some embodiments might store sub-networks differently than the storage-scheme described above. Also, some embodiments might use databases and data storages that are not machine generated. In addition, some embodiments might use different encoding and indexing schemes than those described above. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

1. For a candidate sub-network that performs multiple output functions that are produced outside the candidate sub-network in an integrated circuit (IC) design, a method for retrieving a replacement sub-network for the multi-output function candidate sub-network, the method comprising: a) generating a parameter by: (i) identifying one of the output functions of the candidate sub-network as a first output function; (ii) using the first output function as a reference from which to derive an index for each output function of the candidate sub-network; (iii) deriving the parameter from the indices for the output functions of the candidate sub-network; b) using the parameter to retrieve a replacement sub-network from a storage structure that stores replacement sub-networks; and c) replacing the candidate sub-network in the design with the replacement sub-network.
2. The method of claim 1 further comprising: before generating the parameter, selecting the candidate sub-network from the design.
3. The method of claim 1, wherein the index of the first output function is a primary index, and the index of each non-first output function is a secondary index, wherein using the parameter comprises: for each particular index pair formed by the primary index and one of the secondary indices, identifying each replacement sub-network stored in the storage structure that is associated with the particular index pair; determining whether any of the identified replacement sub-networks are associated with all the index pairs; and retrieving any identified replacement sub-network that is associated with all index pairs.
4. The method of claim 1, wherein the candidate sub-network has a set of input variables, wherein generating the parameter further comprises: using the first output function to specify a configuration for the input variables; and based on the specified input-variable configuration, specifying an index for each output function.
5. The method of claim 4, wherein using the first output function to specify an input-variable configuration comprises: identifying a canonic representation for the plurality of output functions based on the specified input-variable order of the first output function.
6. The method of claim 5, further comprising: generating a truthtable representation of the first output function; wherein identifying the canonic representation includes identifying a canonic representation of the truthtable representation of the first output function.
7. The method of claim 5 further comprising: condensing the canonic representation to obtain a condensed representation of the first output function.
8. The method of claim 5 further comprising: specifying a condensed representation of each non-first output function based on the selected input-variable configuration.
9. The method of claim 1, wherein using the parameter to retrieve the replacement sub-network comprises using the index derived from the first output function of the candidate sub-network to identify a first set of replacement sub-networks having a first output function that produces a same output as the first output function of the candidate sub-network.
10. The method of claim 9, wherein using the parameter to retrieve the replacement sub-network further comprises using the index derived from a second output function of the candidate sub-network to identify a replacement sub-network having a second output function that produces an output of the second output function of the candidate sub-network from the first set of replacement sub-networks.
11. The method of claim 1, wherein using the parameter to retrieve the replacement sub-network comprises searching only replacement sub-networks in the storage structure that have a first output function producing a first output of the first output function of the candidate sub-network based on an index derived from the first output function and not searching other replacement sub-networks within the storage structure.
12. A computer readable storage medium storing a computer program for retrieving a replacement sub-network for a multi-output function candidate sub-network in a design, the computer program comprising sets of instructions for: a) generating a parameter comprising at least a first index for a first output function that produces a first output outside the candidate sub-network and a second index for a second output function that produces a second output outside the candidate sub-network, wherein generating the parameter comprises: (i) identifying a particular input ordering for one of the output functions of the candidate sub-network as a first output function that generates the first index; (ii) using the particular input ordering for the first function to specify an index for each output function of the candidate sub-network; and (iii) deriving the parameter from the indices for the output functions of the candidate sub-network; b) using the parameter to retrieve a replacement sub-network from a storage structure that stores replacement sub-networks; and c) replacing the candidate sub-network in the design with the replacement sub-network.
13. The computer readable storage medium of claim 12, further comprising sets of instructions for: before generating the parameter, selecting the candidate sub-network from the design.
14. The computer readable storage medium of claim 12, wherein the index of the first output function is a primary index, and the index of each non-first output function is a secondary index, wherein using the parameter comprises sets of instructions for: for each particular index pair formed by the primary index and one of the secondary indices, identifying each replacement sub-network stored in the storage structure that is associated with the particular index pair; determining whether any of the identified replacement sub-networks are associated with all the index pairs; and retrieving any identified replacement sub-network that is associated with all index pairs.
15. The computer readable storage medium of claim 12, wherein said parameter is an amalgamation of the specified indices for the output functions of the candidate sub-network.